PostgreSQL
The world's most advanced open source database
Posted by Denish Patel in OmniTI on 2014-12-22 at 21:04:22

It’s about time to wrap up 2014 :)

This year, 2014, has been a very successful and productive year for my career and professional development. I got the chance to speak at 5 conferences and at a Postgres users group in Philly. I also once again had the chance to attend the Surge 2014 conference in Washington DC, hosted by my employer, OmniTI. Thanks to all the conference committee members for the speaking and networking opportunities.

Furthermore, my team (database team) was nominated for and won a Silver medal in the Best Technology Team of the Year category at the Stevie Awards. Thanks to everyone on the team for the support that made this recognition possible.

Last but not least, Postgres 9.4 was released last week. It is an awesome gift for us from the Postgres community! I’m looking forward to upgrading a couple of large production systems to Postgres 9.4 early next year.

Additionally, I’d like to push myself to get more involved in supporting the Postgres community over the next year(s). As of now, I will be speaking at and/or attending the Postgres Conf Russia & Postgres NYC conferences in early 2015. I’m hoping to write more blog posts on technologies in 2015.

I’m ready to Welcome 2015 :)

Thanks again for reading and your continuous support.

Posted by Robert Haas in EnterpriseDB on 2014-12-22 at 20:45:00
It's been over a year since I last blogged about parallelism, so I think I'm past due for an update, especially because some exciting things are happening.

First, Amit Kapila has published a draft patch for parallel sequential scan.  Many things remain to be improved about this patch, which is neither as robust as it needs to be nor as performant as we'd like it to be nor as well-modularized as it really should be.  But it exists, and it passes simple tests, and that is a big step forward.  Even better, on most of Amit's tests, it shows a very substantial speed-up over a non-parallel sequential scan.
Read more »
Posted by Greg Sabino Mullane in EndPoint on 2014-12-22 at 19:07:00
Bucardo is one of the trigger-based replication systems for Postgres (others include Slony and Londiste). All of these not only use triggers to gather information on what has changed, but they also disable triggers when copying things to remote databases. They do this to ensure that only the data itself gets copied, in as fast a manner as possible. This also has the effect of disabling foreign keys, which Postgres implements by use of triggers on the underlying tables. There are times, however, when you need a trigger on a target to fire (such as for data masking). Here are four approaches to working around the disabling of triggers. The first two solutions will work with any replication system, but the third and fourth are specific to Bucardo.

First, let's understand how the triggers get disabled. A long time ago (Postgres 8.2 and older), triggers had to be disabled by direct changes to the system catalogs. Luckily, those days are over, and now this is done by issuing this command before copying any data:

SET session_replication_role = 'replica';

This prevents all normal triggers and rules from being activated. There are times, however, when you want certain triggers (or their effects) to execute during replication.
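
One generic workaround (a minimal sketch with hypothetical table and trigger names, not necessarily the exact approach the full post takes) is to mark the trigger as ALWAYS, so that it fires even in replica mode:

-- Make the trigger fire even when session_replication_role = 'replica'
ALTER TABLE mytable ENABLE ALWAYS TRIGGER mytrigger;
-- Revert to the default behavior (fire only in 'origin' mode)
ALTER TABLE mytable ENABLE TRIGGER mytrigger;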

Let's use a simple hypothetical to illustrate all of these solutions. We will start with the Postgres built-in pgbench utility. The initialize option (-i) can be used to create and populate some tables:

$ createdb btest1
$ pgbench -i btest1
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.16 s, remaining 0.00 s).
vacuum...
set primary keys...
done.

We want to replicate all four of the tables pgbench just created. Bucardo requires that a table have a primary key or a unique index to be replicated, so we will need to make an i

[continue reading]

Just a small notice - plpgsql_check is available from the PGXN repository. More about PGXN on its FAQ page.
Posted by Robins Tharakan on 2014-12-20 at 16:08:00
In a product-based company, DB developers at times don't get direct access to production boxes. Instead, an IT resource ends up managing a large swathe of Web / DB boxes. Since such a resource generally has a large field of operation, at times they need quick steps to identify sudden high disk usage. In such a scenario (where Production is silo-ed off from DB Developers), correct triaging of
Posted by Luca Ferrari in ITPUG on 2014-12-20 at 07:48:00
This is my first attempt to appear on Planet PostgreSQL. I'm Luca, the current president of the Italian PostgreSQL Users' Group (ITPUG) and a PostgreSQL addict. Unfortunately I'm not currently using PostgreSQL in my day-to-day job, but I'm following the evolution of the project and using it wherever possible. I hope to be able to contribute to the community somehow.
Posted by Andrew Dunstan in pgExperts on 2014-12-19 at 15:47:00
One of the things I was hoping would make it into PostgreSQL 9.4 and didn't was some extra statistics in pg_stat_statements. Specifically, it was proposed to add minimum, maximum and (after I proposed it and pressed the case) the standard deviation for execution times. Keeping these stats is very cheap, so there should be close to zero performance impact. They will give you a much better idea of the spread of execution times than a simple average, which might be significantly dominated by outlier times. Unfortunately, these got left out for 9.4, which is rather disappointing. I'm going to see if I can revive the proposal for 9.5.

I apologize for my cancelled post titled “Linux System Programming in C”. It was supposed to be posted under another category.

To alleviate database load, my first pick is pgpool-II with memory_cache_enabled turned on. Unfortunately, it does not cache query results from stored procedures.

The alternative is pgmemcache. It provides interface functions to memcached and can be called from within stored procedures.

Borrowing the idea from pgpool-II, the SQL command and its result are cached in memcached:

  • Taking the md5 of the SQL as the key.
  • Serializing the set of records to a string as the value.

Assumptions:

  • A memcached server is up and running.
  • The pgmemcache extension is installed in the PostgreSQL server.
  • postgresql.conf has been set up accordingly (see the sketch below).
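
A rough idea of what that setup looks like (the server address and settings here are illustrative; check the pgmemcache documentation for your version):

-- postgresql.conf (illustrative values):
--   shared_preload_libraries = 'pgmemcache'
--   pgmemcache.default_servers = 'localhost:11211'
-- then, in the database:
CREATE EXTENSION pgmemcache;
SELECT memcache_set('some_key', 'some_value');  -- store a value
SELECT memcache_get('some_key');                -- fetch it back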

My case is the query result for the chart of accounts (coa) table, joining 3 tables:

SELECT
  coa.id,
  tipe.label AS type_label,
  coa.kode,
  coa.label AS coa_label,
  dbcr.label AS dbcr_label
FROM acct.coa coa
INNER JOIN acct.tipe tipe ON coa.tipe=tipe.id
INNER JOIN acct.dbcr dbcr ON coa.dbcr=dbcr.id
OFFSET 0 LIMIT 25;

I wrap it inside a plpgsql stored procedure. Here is my custom data type:

CREATE TYPE acct.coa_cache AS
(i_id integer,
t_tipe text,
t_kode text,
t_label text,
t_dbcr text);

Stored Procedure:

CREATE OR REPLACE FUNCTION acct.coa_query_page_cache(OUT d acct.coa_cache)
RETURNS SETOF acct.coa_cache AS
$BODY$
DECLARE
  _sql text;
  _key text;
  _v text;
  _ds acct.coa_cache[];
  _i integer;
  _count integer;
BEGIN
  _sql:= 'SELECT coa.id'
    || ',tipe.label'
    || ',coa.kode'
    || ',coa.label'
    || ',dbcr.label'
    || ' FROM acct.coa coa'
    || ' INNER JOIN acct.tipe tipe ON coa.tipe=tipe.id'
    || ' INNER JOIN acct.dbcr dbcr ON coa.dbcr=dbcr.id'
    || ' OFFSET 0 LIMIT 10';
  _key:=md5(_sql); --taking md5 as key
  _v:=memcache_get(_key); --get hash value by key
  IF _v IS NOT NULL THEN --hit
    _ds:=NULLIF(_v,'')::acct.coa_cache[];--convert string to acct.coa_cache array
    _count:=COALESCE(ARRAY_UPPER(_ds,1),0);
    FOR _i IN 1.._count LOOP
      d:=(_ds[_i]);
      RETURN NEXT;
    END LOOP;
  ELSE --m

[continue reading]

Foreign tables are getting step-by-step closer to the features that are present in normal relations with the addition of support for CHECK constraints.

commit: fc2ac1fb41c2defb8caf825781af75db158fb7a9
author: Tom Lane <tgl@sss.pgh.pa.us>
date: Wed, 17 Dec 2014 17:00:53 -0500
Allow CHECK constraints to be placed on foreign tables.

As with NOT NULL constraints, we consider that such constraints are merely
reports of constraints that are being enforced by the remote server (or
other underlying storage mechanism).  Their only real use is to allow
planner optimizations, for example in constraint-exclusion checks.  Thus,
the code changes here amount to little more than removal of the error that
was formerly thrown for applying CHECK to a foreign table.

(In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page,
which had accumulated some weird decisions about ordering etc.)

Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and
Ashutosh Bapat

As the constraint evaluation is not done on the PostgreSQL side (it is the responsibility of the remote source to perform the constraint checks), what this really does is allow systems to make consistent CHECK declarations on both the remote and local sides. This is useful for the planner, which can take advantage of it to perform optimizations that yield consistent plans across the remote and local sources, particularly in the case where the remote source is a PostgreSQL server itself.

This behavior is similar to NOT NULL: the constraint check is done on the remote side. For example, with a single Postgres instance linked to itself, nothing is enforced if the constraint is defined locally but not remotely.

=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5432', dbname 'postgres');
CREATE SERVER
=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server
OPTIONS (password '');
CREATE USER MAPPING
=# CREATE TABLE

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2014-12-19 at 00:09:00
PostgreSQL 9.4 is released today, and once again my main input has been in the JSON area. We've introduced a new spiffy JSON type called "jsonb" that makes performing operations on JSON much, much faster, because it's not stored as plain text, but rather decomposed into the structures and scalar values that go to make up JSON. Much of the credit for this goes to Oleg Bartunov and Teodor Sigaev, the original authors of the hstore module, who created a "nested hstore" module that we eventually based jsonb on. Credit also goes to Alexander Korotkov and Peter Geoghegan, especially for their indexing work, which will make many operations super fast. My role was to make sure that all (or almost all) of the operations that are available for the original json type are also available for the jsonb type.

There have been some benchmark tests that suggest that our jsonb is faster in many respects than other database engines that use json as their native storage.

The nice thing about this is that you can now have a high-performing mixture of traditional relational data and semi-structured treeish data with proper transactional semantics, all in one engine. That's a very attractive proposition for a lot of people looking to design new data stores.

My other JSON contribution was to add a number of functions that allow the construction of arbitrarily complex json documents, with data from the database. This will be particularly useful for clients that expect json in a particular, non-regular shape, and is based on the needs of some actual json-using customers.

Already committed for 9.5, to be released some time next year, are jsonb versions of those json functions we didn't provide jsonb versions of in 9.4, and functions to allow the stripping of object fields with 'null' as the value from json and jsonb values (this can result in much more compact storage in some cases, with no significant semantic loss).

I made a few other contributions to 9.4, but they are not worth mentioning in detail.

I'm now looking for new Postgr

[continue reading]

Posted by Josh Berkus in pgExperts on 2014-12-19 at 00:05:00
For the third night of Hanukkah, we have a nice present for you: PostgreSQL 9.4.0.  Now eat your latkes.



Let's talk about 9.4 by talking about one of my favorite features: JSONB.  A lot of people worked to bring you JSONB in this release, including my colleague Andrew Dunstan, Oleg Bartunov and Teodor Sigaev (sponsored by Engine Yard), Peter Geoghegan of Heroku.com, and several others.

For example, imagine we have a large set of JSON documents which contain publication data about various books.  One reason we might have this is that we receive the data in JSON or HTML format from various publishers, and it is not normalized, so we keep it in its original format for data mining.  We can store them in a JSONB column like so:

    create table booksdata (
        title citext not null,
        isbn isbn not null primary key,
        pubinfo jsonb not null
    );

Some example pubinfo might be:

    {"authority" : [ "James Michaels", "Nina Totenberg" ], "cost": 16, "edition" : 3,
     "format": "Trade PB", "publisher": "HarperOne",
     "published_on": "1995-08-05", "revised" : [ "1996-06-01", "1999-01-01" ] }

You'll note that the keys are already sorted in alphabetical order.  This is a feature of JSONB; whitespace gets cleaned up and keys get sorted on import.

Then we can create a general index on the JSONB like so:

    CREATE INDEX ON booksdata USING GIN (pubinfo);

or:

    CREATE INDEX ON booksdata USING GIN (pubinfo jsonb_path_ops);

There are two different versions depending on the operations you expect to run.  The standard GIN index supports every kind of operation we support for JSONB.  The path_ops index supports only the search path operator "@>" (see below), but produces a smaller and faster index for these kinds of searches.  However, we're going to do a bunch of different JSON operations for data mining, so we'll use the regular GIN index.
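
For example, a containment search that either index can answer looks like this (a minimal sketch against the table above; the publisher value is just illustrative):

    SELECT isbn, title
    FROM booksdata
    WHERE pubinfo @> '{"publisher": "HarperOne"}';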

Once we have the index, we can do arbitrary path searches on JSONB, and these will be index searches, depending on what the planner picks.  Examples of JSONB qu

[continue reading]

Posted by Paul Ramsey on 2014-12-19 at 00:00:00

The last couple weeks have seen two interesting updates in the world of PostgreSQL “foreign data wrappers” (FDW). Foreign data wrappers allow you to access remote data inside your database, exactly like other tables. PostgreSQL ships with two example implementations, one for accessing text files, and the other for accessing remote PostgreSQL servers.
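
As a quick illustration of the concept, the bundled file_fdw wrapper can expose a CSV file as a table (a minimal sketch; the file name and columns are made up):

CREATE EXTENSION file_fdw;
CREATE SERVER csv_source FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE places (
    name text,
    lon  double precision,
    lat  double precision
) SERVER csv_source
  OPTIONS (filename '/tmp/places.csv', format 'csv', header 'true');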

The two updates of interest to PostGIS users are:

  • The Oracle FDW implementation was recently enhanced to support spatial columns, so an Oracle table with SDO_GEOMETRY columns can be exposed in PostgreSQL as a table with PostGIS geometry columns.
  • A new OGR FDW implementation was released that supports exposing any OGR data source as a table with PostGIS geometry columns.

Now you can access your PostGIS data without even going to the trouble of importing it first!

Posted by Paul Ramsey on 2014-12-18 at 00:00:00

The 2.1.5 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.5 release. As befits a patch release, the focus is on bugs, breakages, and performance issues.

http://download.osgeo.org/postgis/source/postgis-2.1.5.tar.gz

Continue Reading by clicking title hyperlink ..
Posted by gabrielle roth on 2014-12-16 at 01:04:00
I had reason recently to suspect that autovacuum jobs weren’t being properly logged on my RDS instance. Instead of compulsively re-running “SELECT relname, last_autovacuum FROM pg_stat_user_tables” while I waited for one to happen, I set up this quick test: connect to database and do the following: -- verify settings -- log_autovacuum_min_duration should be 0 ("log […]
Posted by Robins Tharakan on 2014-12-14 at 21:50:00
Recently, someone asked for some inputs before going ahead with a PostgreSQL Certification. From what I know, outside Japan, EnterpriseDB is the only consultancy offering certification services consistently over the past few years. Having recently cleared one from EnterpriseDB, I had a few things to say, and thought I'd rather list them in a full-blown post here. An important
Posted by Pavel Stehule on 2014-12-14 at 17:52:00
mscottie found a magic option -undefined dynamic_lookup. With this option we are able to compile plpgsql_check on the OS X platform.
Every new release of PostgreSQL comes packed with exciting features. To benefit from the new features, the database server should be upgraded. Choosing traditional upgrade paths like pg_dump/pg_restore or pg_upgrade requires significant application downtime. Today, if you are looking for a minimum-downtime upgrade path between major PostgreSQL versions with a perfect rollback plan, it can be accomplished with asynchronous Slony-I replication. Since Slony-I (know more about it here) can easily replicate between different PostgreSQL versions, OSes and bit architectures, upgrades are doable without requiring substantial downtime. In addition, it has consistent switchover and switchback functionality in its design.

IMO, while doing major version upgrades there should be a proper fallback plan: just in case the application turns out to be buggy or fails to perform well on the upgraded version, we should be able to roll back to the older version immediately. Slony-I provides such functionality by way of switchback. This post demonstrates a minimum-downtime upgrade, including the switchover/switchback steps.

Before going to the demo, one important point should be noted: prior to PG 9.0.x, bytea datatype columns stored data in ESCAPE format, whereas in later versions the default is HEX format. While performing a switchback (newer version to older version), this kind of bytea format difference is not supported by Slony-I, hence the ESCAPE format should be maintained throughout the upgrade duration, or else you may encounter an error:
ERROR  remoteWorkerThread_1_1: error at end of COPY IN: ERROR:  invalid input syntax for type bytea
CONTEXT: COPY sl_log_1, line 1: "1 991380 1 100001 public foo I 0 {id,500003,name,"A ",b,"\\\\x41"}"
ERROR remoteWorkerThread_1: SYNC aborted
To fix this, no changes are required on PG 8.4.x, but on PG 9.3.5 the bytea_output parameter should be changed from HEX to ESCAPE as shown. We can set it at the cluster level ($PGDATA/postgresql.conf) or at the user level (ALTER USER ... SET); I have pre

[continue reading]

Posted by Ian Barwick on 2014-12-11 at 21:48:00

Japan has been an early and vigorous adopter of PostgreSQL (back in 2006, when PostgreSQL was still...


Posted by Josh Berkus in pgExperts on 2014-12-11 at 07:40:00
Of course, the first thing which occurred to me with pg_shard was to create a distributed "NoSQL" database using 9.4's JSONB.  I had to wait a bit because there were some issues with 9.4 compatibility which needed resolving, but this week I was able to build it.  I was hoping to show it off at the SFPUG meeting, but sadly the weather gods (in the form of the PostgreSQL-powered NWS) interfered.

So we'll go over it here because it makes a good introduction to pg_shard and to JSONB.

First step was to spin up an AWS m3.xlarge instance to be our master.  While I don't need much in other capabilities, I do want the master node to have AWS's best network option since it'll be doing a LOT of networking.  Do, however, make sure to configure it with instance storage because you'll want that space for temp tables and any tables which aren't sharded.  This also becomes our template for the shards, because it's easier that way.  I created the instance running Ubuntu 14.04.  Then I had to install all of the PostgreSQL 9.4 components and a few others, including:

  • postgresql-9.4, postgresql-contrib-9.4, postgresql-server-dev-9.4
  • python-psycopg2 (for my benchmark test)
  • git, make

After that I "git cloned" the pg_shard repo and did a make and make install in it.  Currently it emits some spurious warnings with 9.4; ignore those.

Next was configuring PostgreSQL for pg_shard.  This is the master node, so it's where we're going to use pg_shard.  In addition to the usual PostgreSQL configuration, you want to change two settings:

shared_preload_libraries = 'pg_shard' #load pg_shard
temp_buffers = 64MB #pg_shard uses temp tables internally


Then start (or restart) PostgreSQL.  After that, we create a "psocial" database to represent our dummy social networking application, and set up a .pgpass file and an authorized_hosts file for passwordless access by the postgres user.  Then we stop and shut PostgreSQL down, and use the AWS API to create eight r3.large instances based on that master instance to be our shards.  This way they alr

[continue reading]

Posted by Ian Barwick in 2ndQuadrant on 2014-12-11 at 04:51:04

Japan has been an early and vigorous adopter of PostgreSQL (back in 2006, when PostgreSQL was still emerging from obscurity in the western hemisphere, I noted that in Tokyo bookstores, PostgreSQL books outweighed MySQL ones by about 5:3), and it’s no surprise that by nationality, only the USA and Germany have more committers. The Japan PostgreSQL User Group (JPUG) has been around a while too and is one of the oldest and most active worldwide (and has been around long enough to have established a Japanese logo for PostgreSQL which is often used in place of Slonik [*]). This year JPUG celebrates its 15th anniversary, and the 2014 conference – held in Tokyo on December 5th – was the largest yet with over 20 speakers, 5 (five!) parallel tracks and around 280 attendees.

Continuing a tradition started in 2012, the keynote speaker has been a foreign guest, and this year’s speaker was 2ndQuadrant’s very own Simon Riggs, who gave a well-received talk about the open source development process and PostgreSQL. (The interpreter did a fantastic job despite the occasional idiomatic diversion from the script).

A key theme of this year’s conference was international exchange in general, and Asia in particular. As you can see from the schedule below, apart from the keynote speech there was an international track (rightmost column) with talks in English by speakers from a variety of Asian countries, and it’s hoped this will develop into a PgCon Asia.

Another theme to emerge was the need for increased communication between the Japanese PostgreSQL community and the rest of the planet – Japan is home to many interesting developments which often go unnoticed in the wider world – and 2ndQuadrant hopes to play a part in improving their visibility.

Finally, no modern conference is complete without lightning talks, which this year took place during the post-conference party. By an ancient Japanese tradition dating back to the earlier part of this year, each speaker (including yours truly, who burbled on about the ideas contained in the p

[continue reading]

Posted by Dan Robinson on 2014-12-10 at 17:45:57

Did you know PostgreSQL supports indexing a subset of your table? This enables very fast reads from that subset with almost no index overhead. It’s often the best way to index your data if you want to repeatedly analyze rows that match a given WHERE clause. This makes PostgreSQL a great fit for workflows that involve pre-aggregation with additional ad hoc analysis. In this post, I’ll walk through an example query optimization for which partial indexes are well suited.

Consider a table of events with the following schema:

Each event is associated with a user and has an ID, a timestamp, and a JSON representation of the event. The JSON includes the page path, the event type (e.g. click, page view, form submission), and any other properties that describe the event.
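
The schema itself isn't reproduced in this excerpt; a minimal sketch consistent with that description (the column names are hypothetical) might be:

CREATE TABLE event (
    id      bigserial PRIMARY KEY,   -- event ID
    user_id bigint NOT NULL,         -- the user the event belongs to
    time    timestamptz NOT NULL,    -- when the event happened
    data    jsonb NOT NULL           -- path, type, and other properties
);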

We can use this table to store many different kinds of events, and let’s assume we have some automatic event tracking in place that logs every click, page view, and form submission so that we can analyze it later. We might want to have an internal dashboard that displays some high-value metrics, such as the number of weekly signups or the amount of revenue we are collecting per day. The events that are relevant to this dashboard make up a small fraction of this table — a very small percentage of clicks on your website are to finalize purchases! But they are mixed in with the rest of the table, so our signal-to-noise ratio is low.

We would like to index our data to make these dashboard queries fast.[1] Let’s start with a signup event, which we define to be a form submission on our /signup/ page. Getting the number of such signups in the first week of September translates to:
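
The query itself isn't included in this excerpt either; against the hypothetical schema above, and assuming form submissions are logged with an event type of 'submit', it would look something like:

SELECT count(*)
FROM event
WHERE (data->>'type') = 'submit'
  AND (data->>'path') = '/signup/'
  AND time >= '2014-09-01'
  AND time < '2014-09-08';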

On a synthetic dataset with 10 million events, of which 3000 are signups, and without any indexes, this query takes 45 seconds.

Full Indexes On Single Columns: A Mixed Bag

A naive way to improve this performance is by creating single-column indexes for each of the relevant event features: (data->>'type'), (data->>'path'), and time. We can use a bitmap join between results from three i

[continue reading]

On 8th of December, Simon Riggs committed patch: REINDEX SCHEMA   Add new SCHEMA option to REINDEX and reindexdb.   Sawada Masahiko   Reviewed by Michael Paquier and Fabrízio de Royes Mello This is a very simple thing, so the description will not be long. But – I'm mostly a DBA, so all things that give […]
Posted by Josh Berkus in pgExperts on 2014-12-09 at 06:09:00
One of the limitations with both RDS and Heroku is that you can only install the extensions they've approved and made available to you.  Except ... you can install "SQL-only" extensions which don't have a binary component.  Like pg_partman.

pg_partman is a set of functions and tables which help you manage your partitioned tables, including creating new partitions and triggers on the partition master.  I love it because it means I never have to write ad-hoc code for a client to do that again.  So of course I want to install it on our clients who are running on Heroku and RDS and need partitioning.  However, I have no access to "create extension partman" on the RDS command line, so how do I make it work?  Mostly by loading the extension the old way.

1) create a new schema on the target database called "partman".

2) Install PostgreSQL 9.3 on my laptop, including the development files.

3) git clone pg_partman.

4) make & make install pg_partman on my laptop.  Inside the pg_partman source directory, that will create a file in the "sql" directory called (currently): pg_partman--1.7.2.sql.

5) Since that file is designed to be installed as an extension, I need to alter it.  First, I replace all occurrences of "@extschema@" with "partman".  Second, I delete the line "SELECT pg_catalog.pg_extension_config_dump('part_config', '');"

6) Load partman into your database by using "psql -f  pg_partman--1.7.2.sql YOUR_DB_NAME -h YOUR_HOST".  In the RDS case, this needs to be done as the instance superuser.

pg_partman can now be used in the target database by calling all of the functions as "partman.function", or by putting the partman schema in your search_path.
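
From there, creating a new partition set is just a function call. This is only a hedged sketch (the table name and interval are hypothetical, and the exact create_parent() arguments vary between pg_partman versions, so check the docs for the version you loaded):

SELECT partman.create_parent('public.events', 'created_at', 'time-static', 'daily');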

Now, this has the major drawback that upgrading pg_partman once 1.7.3 comes out will be a very manual process.  But that's better than not using it at all.



Posted by gabrielle roth on 2014-12-09 at 04:27:00
For the last PDXPUG lab of the year, we tried out BiDirectional Replication (BDR). Five of us just set up VMs on our laptops and followed the instructions on the wiki. We only had about 90 minutes for this lab, so the goal was to get a basic configuration up & running, understand the […]

PostgreSQL 9.5 has added a new mode for REINDEX with this commit:

commit: fe263d115a7dd16095b8b8f1e943aff2bb4574d2
author: Simon Riggs <simon@2ndQuadrant.com>
date: Tue, 9 Dec 2014 00:28:00 +0900
REINDEX SCHEMA

Add new SCHEMA option to REINDEX and reindexdb.

Sawada Masahiko

Reviewed by Michael Paquier and Fabrízio de Royes Mello

Up to 9.4, REINDEX is able to run on different things:

  • INDEX, to reindex a given index.
  • TABLE, to reindex a table entirely, including its underlying toast index if it has a toast table.
  • DATABASE, to reindex all the relations of the database the user is currently connected to, including catalog tables.
  • SYSTEM, to reindex all the system catalogs.

A couple of things to note though:

  • Indexes being reindexed are locked with a strong exclusive lock, preventing any other session from touching them.
  • Parent tables are locked with a share lock.
  • DATABASE and SYSTEM cannot run in a transaction, and process tables one-by-one, pg_class being run first as REINDEX updates it (this last point is an implementation detail, not mentioned in the docs).

The new mode for SCHEMA mixes those things, but behaves similarly to DATABASE and SYSTEM, for a schema:

  • It cannot run in a transaction.
  • Each table of the schema is processed one-by-one.
  • pg_class is reindexed first only if pg_catalog is processed.

That's actually what you can see here; note that pg_class is at the top of the relations reindexed.

=# REINDEX SCHEMA pg_catalog;
NOTICE:  00000: table "pg_catalog.pg_class" was reindexed
LOCATION:  ReindexObject, indexcmds.c:1942
[...]
NOTICE:  00000: table "pg_catalog.pg_depend" was reindexed
LOCATION:  ReindexObject, indexcmds.c:1942
REINDEX

And that this operation is non-transactional:

=# BEGIN;
BEGIN
=# REINDEX SCHEMA pg_catalog;
ERROR:  25001: REINDEX SCHEMA cannot run inside a transaction block
LOCATION:  PreventTransactionChain, xact.c:2976

A last thing to note is that a user who has no access to a schema will logically not be able to run REINDEX on it.

=# CREATE USER foo;
CREATE ROLE
=# SET SE

[continue reading]

On 7th of December, Simon Riggs committed patch: Event Trigger for table_rewrite   Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason.   Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. […]
Posted by Craig Ringer in 2ndQuadrant on 2014-12-08 at 03:27:44

For a couple of years now a team at 2ndQuadrant led by Andres Freund have been working on adding bi-directional asynchronous multi-master replication support for PostgreSQL. This effort has become known as the BDR project.

We’re really excited to see these efforts leading to new PostgreSQL features and have a great deal more still to come.

Incremental Development

As a large development project it is neither practical nor desirable to deliver all the changes to PostgreSQL as a single huge patch. That way lies madness and unmaintainable code. It would also be resoundingly rejected by the PostgreSQL community, as it should be.

Instead, BDR has been developed as a series of discrete changes to core PostgreSQL, plus an extension that uses those core changes to implement multi-master asynchronous logical replication.

To make this separation clearer we’ve split BDR into two separate trees since the last official release. We now manage and develop the set of patches to PostgreSQL separately from the BDR extension itself.

BDR is destined for community PostgreSQL

2ndQuadrant’s goal is to get all of BDR – the changes to the core server as well as the extension or its functionality – included in a future open source community PostgreSQL release.

When that’s achieved you’ll be able to use all the features currently provided by the BDR patches and extension with no 3rd party (open source or otherwise) patches, extensions and tools.

All of BDR is open source, released under the PostgreSQL license, and is developed on PostgreSQL’s public infrastructure. The git repository containing the authoritative current version of BDR is on git.postgresql.org.

Getting BDR into users’ hands

With a project as big as BDR and with the PostgreSQL project’s safety-first, conservative and careful development style it’ll be a long time before all of BDR is included in community PostgreSQL. As a result we’ve had to make some compromises in order to get BDR functionality out to users prior to the expected end of the 9.5/9.6 release cycles in a cou

[continue reading]

Posted by Dave Page in EnterpriseDB on 2014-12-07 at 12:30:00
I’m often asked how I first became involved in PostgreSQL, and how the pgAdmin project got started. Much as I’m happy to tell the story over beer, it becomes tedious to do so over email after the first half-dozen or so times. So in a vain attempt to save my fingers from future carpal tunnel syndrome, here’s the tale…

I first discovered PostgreSQL in 1997. I was working at a social housing provider in the UK where, at the time, we were running Informix SE and Pick databases. We wanted to start building in-house web applications for managing additional areas of the business. There were no useful interfaces for Pick that could be used with any of the web technologies available to us at the time and the licensing model for Informix and its ODBC driver was prohibitively expensive, so I started looking for alternatives. 

After spending some time researching mSQL and MySQL, I eventually found PostgreSQL, which seemed to offer everything we needed—a SQL-based DBMS with ODBC and C interfaces, a liberal license, a rich set of features, and a vibrant community supporting it that was responsive to bug reports and open to feature requests. My colleagues and I developed a number of applications using PostgreSQL in the following years and the database became a critical part of our business. We had applications written in PHP for the web as well as Visual Basic for the desktop users. 

However, it was early on in our journey with Postgres that, as a Windows shop (on the desktop at least), we found ourselves wishing for a good graphical management tool for the database. At the time, the only option was pgAccess, which was a basic tool written in TCL/TK that was not easy to get to work on Windows. So I decided to write my own in Visual Basic. The first version was never released to the public and was modeled on the pgAccess design. Called pgManager, it was quickly abandoned as we found the UI to be restrictive and, well, not overly usable. The second incarnation was pgAdmin, which was redesigned from the ground up. It was relea

[continue reading]

Posted by Hubert 'depesz' Lubaczewski on 2014-12-06 at 18:47:33
The general knowledge is that numerics are slower than integers/floats, but offer better precision and ranges. While I understand what is slow, I don't really know how much slower numerics are. So let's test it. There are a couple of things to consider: table size (related to disk usage per column) speed of inserts […]

Oracle FDW is a foreign data wrapper PostgreSQL extension that allows you to read Oracle database tables from a PostgreSQL database. You can get it via the PGXN network or the main website http://laurenz.github.io/oracle_fdw/.

What is new about the latest 1.1.0 release is that there is now support for the Oracle SDO_GEOMETRY type, which allows you to map the most common geometry types POINT, LINE, POLYGON, MULTIPOINT, MULTILINE and MULTIPOLYGON to the PostGIS geometry type. Much of the spatial plumbing work was done by Vincent Mora of Oslandia. If we have any Windows Oracle users out there, yes, there are binaries available for Windows for PostgreSQL 9.1 - 9.4, for both 32-bit and 64-bit. The FDW does have a dependency on OCI.dll, which I think comes shipped with Oracle products. Unfortunately, we are not Oracle database users so we can't kick the tires on this one.
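
For anyone who does have an Oracle instance handy, the basic setup looks roughly like this (a hedged sketch based on the oracle_fdw documentation; the connection details, schema and table names are made up, and the geometry column assumes PostGIS is installed and relies on the new SDO_GEOMETRY mapping):

CREATE EXTENSION oracle_fdw;
CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw
  OPTIONS (dbserver '//oracle.example.com:1521/ORCL');
CREATE USER MAPPING FOR postgres SERVER oradb
  OPTIONS (user 'scott', password 'tiger');
CREATE FOREIGN TABLE ora_parcels (
  id   integer,
  geom geometry
) SERVER oradb OPTIONS (schema 'SCOTT', table 'PARCELS');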