Just a small notice - plpgsql_check is available from the PGXN repository. More about PGXN on the FAQ page.
Posted by Robins Tharakan on 2014-12-20 at 16:08:00
In a product-based company, DB developers at times don't get direct access to production boxes. Instead, an IT resource ends up managing a large swathe of web / DB boxes. Since such a resource generally has a large field of operation, they sometimes need quick steps to identify sudden high disk usage. In such a scenario (where production is siloed away from DB developers), correct triaging of
Posted by Andrew Dunstan in pgExperts on 2014-12-19 at 15:47:00
One of the things I was hoping would make it into PostgreSQL 9.4 and didn't was some extra statistics in pg_stat_statements. Specifically, it was proposed to add minimum, maximum and (after I proposed it and pressed the case) the standard deviation for execution times. Keeping these stats is very cheap, so there should be close to zero performance impact. They will give you a much better idea of the spread of execution times than a simple average, which might be significantly dominated by outlier times. Unfortunately, these got left out for 9.4, which is rather disappointing. I'm going to see if I can revive the proposal for 9.5.

I apologize for my cancelled post titled “Linux System Programming in C”. It was supposed to be posted under another category.

To alleviate database load, my first pick is pgpool-II with memory_cache_enabled turned on. Unfortunately, it does not cache query results from stored procedures.
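For reference, the relevant pgpool.conf settings look roughly like this (a sketch of pgpool-II's in-memory query cache parameters; the host and port values are placeholders):

```
memory_cache_enabled = on
memqcache_method = 'memcached'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
```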

The alternative is pgmemcache. It provides interface functions to memcached and can be called from within a stored procedure.

Borrowing the pgpool-II idea, the SQL command and its result are cached in memcached:

  • Taking the md5 of the SQL as the key.
  • Forming the set of records into a string as the value.


The prerequisites:

  • The memcached server is up and running.
  • The pgmemcache extension is installed in the PostgreSQL server.
  • postgresql.conf has been set accordingly.
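Putting those pieces together, the caching idea can be sketched with pgmemcache's interface functions (the SQL text and the serialized record set below are placeholders):

```sql
-- miss: run the query, then store the stringified records under the md5 key
SELECT memcache_set(md5('SELECT ... FROM acct.coa ...'),
                    '{"(1,Asset,100,Cash,D)"}');

-- hit: the next identical query returns the cached string form of the records
SELECT memcache_get(md5('SELECT ... FROM acct.coa ...'));
```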

My case is the query result for a chart of accounts (coa) table joining 3 tables:

SELECT
  tipe.label AS type_label,
  coa.label AS coa_label,
  dbcr.label AS dbcr_label
FROM acct.coa coa
INNER JOIN acct.tipe tipe ON
INNER JOIN acct.dbcr dbcr ON

I wrap it inside a PL/pgSQL stored procedure. Here is my custom data type:

CREATE TYPE acct.coa_cache AS
(i_id integer,
t_tipe text,
t_kode text,
t_label text,
t_dbcr text);

Stored Procedure:

CREATE OR REPLACE FUNCTION acct.coa_query_page_cache(OUT d acct.coa_cache)
RETURNS SETOF acct.coa_cache AS
$$
DECLARE
  _sql text;
  _key text;
  _v text;
  _ds acct.coa_cache[];
  _i integer;
  _count integer;
BEGIN
  _sql:= 'SELECT'
    || ',tipe.label'
    || ',coa.kode'
    || ',coa.label'
    || ',dbcr.label'
    || ' FROM acct.coa coa'
    || ' INNER JOIN acct.tipe tipe ON'
    || ' INNER JOIN acct.dbcr dbcr ON'
    || ' OFFSET 0 LIMIT 10';
  _key:=md5(_sql); --taking md5 as key
  _v:=memcache_get(_key); --get hash value by key
  IF _v IS NOT NULL THEN --hit
    _ds:=NULLIF(_v,'')::acct.coa_cache[];--convert string to acct.coa_cache array
    FOR _i IN 1.._count LOOP
  ELSE --m

[continue reading]

Foreign tables are getting step-by-step closer to the features that are present in normal relations with the addition of support for CHECK constraints.

commit: fc2ac1fb41c2defb8caf825781af75db158fb7a9
author: Tom Lane <>
date: Wed, 17 Dec 2014 17:00:53 -0500
Allow CHECK constraints to be placed on foreign tables.

As with NOT NULL constraints, we consider that such constraints are merely
reports of constraints that are being enforced by the remote server (or
other underlying storage mechanism).  Their only real use is to allow
planner optimizations, for example in constraint-exclusion checks.  Thus,
the code changes here amount to little more than removal of the error that
was formerly thrown for applying CHECK to a foreign table.

(In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page,
which had accumulated some weird decisions about ordering etc.)

Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and
Ashutosh Bapat

As the constraint evaluation is not done on the PostgreSQL side (it is the responsibility of the remote source to perform the constraint checks), what this really allows is consistent CHECK declarations on both the remote and local sides. This is useful for the planner, which can take advantage of it to perform optimizations that yield consistent plans across the remote and local sources, particularly when the remote source is itself a PostgreSQL server.

This behavior is similar to NOT NULL, the constraint check being done on the remote side. For example, with a single Postgres instance linked to itself, if the constraint is defined locally but not remotely, nothing is actually enforced.

=# CREATE EXTENSION postgres_fdw;
=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5432', dbname 'postgres');
=# CREATE USER MAPPING FOR CURRENT_USER SERVER postgres_server
    OPTIONS (password '');
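With the foreign server in place, a CHECK constraint can now be declared directly on a foreign table (the table and column names below are illustrative); PostgreSQL does not enforce it locally, it merely trusts it for planning:

```sql
=# CREATE FOREIGN TABLE ft_data (
       id int CHECK (id > 0)
   ) SERVER postgres_server OPTIONS (table_name 'data');
```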

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2014-12-19 at 00:09:00
PostgreSQL 9.4 is released today, and once again my main input has been in the JSON area. We've introduced a new spiffy JSON type called "jsonb" that makes performing operations on JSON much, much faster, because it's not stored as plain text, but rather decomposed into the structures and scalar values that go to make up JSON. Much of the credit for this goes to Oleg Bartunov and Teodor Sigaev, the original authors of the hstore module, who created a "nested hstore" module that we eventually based jsonb on. Credit also goes to Alexander Korotkov and Peter Geoghegan, especially for their indexing work, which will make many operations super fast. My role was to make sure that all (or almost all) of the operations that are available for the original json type are also available for the jsonb type.

There have been some benchmark tests that suggest that our jsonb is faster in many respects than other database engines that use json as their native storage.

The nice thing about this is that you can now have a high-performing mixture of traditional relational data and semi-structured treeish data with proper transactional semantics, all in one engine. That's a very attractive proposition for a lot of people looking to design new data stores.

My other JSON contribution was to add a number of functions that allow the construction of arbitrarily complex json documents, with data from the database. This will be particularly useful for clients that expect json in a particular, non-regular shape, and is based on the needs of some actual json-using customers.
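As an illustration, two of those 9.4 construction functions, json_build_object and json_build_array, can be nested to produce arbitrarily shaped documents from SQL values:

```sql
SELECT json_build_object('title', 'PostgreSQL 9.4',
                         'tags',  json_build_array('json', 'jsonb'));
-- yields a single json document with the array nested under "tags"
```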

Already committed for 9.5, to be released some time next year, are jsonb versions of those json functions we didn't provide jsonb versions of in 9.4, and functions to allow the stripping of object fields with 'null' as the value from json and jsonb values (this can result in much more compact storage in some cases, with no significant semantic loss).

I made a few other contributions to 9.4, but they are not worth mentioning in detail.

I'm now looking for new Postgr

[continue reading]

Posted by Josh Berkus in pgExperts on 2014-12-19 at 00:05:00
For the third night of Hanukkah, we have a nice present for you: PostgreSQL 9.4.0.  Now eat your latkes.

Let's talk about 9.4 by talking about one of my favorite features: JSONB.  A lot of people worked to bring you JSONB in this release, including my colleague Andrew Dunstan, Oleg Bartunov and Teodor Sigaev (sponsored by Engine Yard), Peter Geoghegan, and several others.

For example, imagine we have a large set of JSON documents which contain publication data about various books.  One reason we might have this is that we receive the data in JSON or HTML format from various publishers; it is not normalized, so we keep it in its original format for data mining.  We can store the documents in a JSONB column like so:

    create table booksdata (
        title citext not null,
        isbn isbn not null primary key,
        pubinfo jsonb not null
    );

Some example pubinfo might be:

    {"authority" : [ "James Michaels", "Nina Totenberg" ], "cost": 16, "edition" : 3,
     "format": "Trade PB", "publisher": "HarperOne",
     "published_on": "1995-08-05", "revised" : [ "1996-06-01", "1999-01-01" ] }

You'll note that the keys are already sorted in alphabetical order.  This is a feature of JSONB; whitespace gets cleaned up and keys get sorted on import.

Then we can create a general index on the JSONB like so:

    CREATE INDEX ON booksdata USING GIN (pubinfo);


    CREATE INDEX ON booksdata USING GIN (pubinfo jsonb_path_ops);

There are two different versions depending on the operations you expect to run.  The standard GIN index supports every kind of operation we support for JSONB.  The path_ops index supports only the search path operator "@>" (see below), but produces a smaller and faster index for these kinds of searches.  However, we're going to do a bunch of different JSON operations for data mining, so we'll use the regular GIN index.
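For instance, a containment search of the kind the path_ops index is built for, using the example table above:

```sql
SELECT title FROM booksdata
WHERE pubinfo @> '{"publisher": "HarperOne"}';
```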

Once we have the index, we can do arbitrary path searches on JSONB, and these will be index searches, depending on what the planner picks.  Examples of JSONB qu

[continue reading]

Posted by Paul Ramsey on 2014-12-19 at 00:00:00

The last couple weeks have seen two interesting updates in the world of PostgreSQL “foreign data wrappers” (FDW). Foreign data wrappers allow you to access remote data inside your database, exactly like other tables. PostgreSQL ships with two example implementations, one for accessing text files, and the other for accessing remote PostgreSQL servers.

The two updates of interest to PostGIS users are:

  • The Oracle FDW implementation was recently enhanced to support spatial columns, so an Oracle table with SDO_GEOMETRY columns can be exposed in PostgreSQL as a table with PostGIS geometry columns.
  • A new OGR FDW implementation was released that supports exposing any OGR data source as a table with PostGIS geometry columns.

Now you can access your PostGIS data without even going to the trouble of importing it first!

Posted by Paul Ramsey on 2014-12-18 at 00:00:00

The 2.1.5 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.5 release. As befits a patch release, the focus is on bugs, breakages, and performance issues.

Continue Reading by clicking title hyperlink ..
Posted by gabrielle roth on 2014-12-16 at 01:04:00
I had reason recently to suspect that autovacuum jobs weren’t being properly logged on my RDS instance. Instead of compulsively re-running “SELECT relname, last_autovacuum FROM pg_stat_user_tables” while I waited for one to happen, I set up this quick test: connect to database and do the following: -- verify settings -- log_autovacuum_min_duration should be 0 ("log […]
Posted by Robins Tharakan on 2014-12-14 at 21:50:00
Recently, someone asked for some inputs before going ahead with a PostgreSQL certification. From what I know, outside Japan, EnterpriseDB is the only consultancy offering certification services consistently over the past few years. Having recently cleared one from EnterpriseDB, I had a few things to say, and thought I'd enlist them as a full-blown post here. An important
Posted by Pavel Stehule on 2014-12-14 at 17:52:00
mscottie found a magic option, -undefined dynamic_lookup. With this option we are able to compile plpgsql_check on the OS X platform.
Every new release of PostgreSQL comes packed with exciting features. To benefit from them, the database server should be upgraded. Choosing traditional upgrade paths like pg_dump/pg_restore or pg_upgrade requires significant application downtime. Today, if you are looking for a minimum-downtime upgrade path between major PostgreSQL versions with a perfect rollback plan, it can be accomplished with asynchronous Slony-I replication. Since Slony-I (know more about it here) can easily replicate between different PostgreSQL versions, operating systems and bit architectures, upgrades are doable without substantial downtime. In addition, it has consistent switchover and switchback functionality in its design.

IMO, while doing major version upgrades there should be a proper fallback plan, because in case the application turns out to be buggy or fails to perform well on the upgraded version, we should be able to roll back to the older version immediately. Slony-I provides such functionality by way of switchback. This post demonstrates a minimum-downtime upgrade including switchover/switchback steps.

Before going to the demo, one important point to note: prior to PG 9.0.x, bytea columns stored data in ESCAPE format, while later versions use HEX format. When performing a switchback (newer version to older version), this kind of bytea format difference is not supported by Slony-I, hence the ESCAPE format should be maintained throughout the upgrade duration, or you may encounter an error:
ERROR  remoteWorkerThread_1_1: error at end of COPY IN: ERROR:  invalid input syntax for type bytea
CONTEXT: COPY sl_log_1, line 1: "1 991380 1 100001 public foo I 0 {id,500003,name,"A ",b,"\\\\x41"}"
ERROR remoteWorkerThread_1: SYNC aborted
To fix this, no changes are required on PG 8.4.x, but on PG 9.3.5 the bytea_output parameter should be set from HEX to ESCAPE as shown. We can set it at cluster level ($PGDATA/postgresql.conf) or user level (ALTER USER ... SET); I have pre

[continue reading]

Posted by Ian Barwick on 2014-12-11 at 21:48:00

Japan has been an early and vigorous adopter of PostgreSQL (back in 2006, when PostgreSQL was still...


Posted by Josh Berkus in pgExperts on 2014-12-11 at 07:40:00
Of course, the first thing which occurred to me with pg_shard was to create a distributed "NoSQL" database using 9.4's JSONB.  I had to wait a bit because there were some issues with 9.4 compatibility which needed resolving, but this week I was able to build it.  I was hoping to show it off at the SFPUG meeting, but sadly the weather gods (in the form of the PostgreSQL-powered NWS) interfered.

So we'll go over it here because it makes a good introduction to pg_shard and to JSONB.

First step was to spin up an AWS m3.xlarge instance to be our master.  While I don't need much in other capabilities, I do want the master node to have AWS's best network option since it'll be doing a LOT of networking.  Do, however, make sure to configure it with instance storage because you'll want that space for temp tables and any tables which aren't sharded.  This also becomes our template for the shards, because it's easier that way.  I created the instance running Ubuntu 14.04.  Then I had to install all of the PostgreSQL 9.4 components and a few others, including:

  • postgresql-9.4, postgresql-contrib-9.4, postgresql-server-dev-9.4
  • python-psycopg2 (for my benchmark test)
  • git, make

After that I "git cloned" the pg_shard repo and did a make and make install in it.  Currently it emits some spurious warnings with 9.4; ignore those.

Next was configuring PostgreSQL for pg_shard.  This is the master node, so it's where we're going to use pg_shard.  In addition to the usual PostgreSQL configuration, you want to change two settings:

shared_preload_libraries = 'pg_shard' #load pg_shard
temp_buffers = 64MB #pg_shard uses temp tables internally

Then start (or restart) PostgreSQL.  After that, we create a "psocial" database to represent our dummy social networking application, and set up a .pgpass file and an authorized_hosts file for passwordless access by the postgres user.  Then we shut PostgreSQL down and use the AWS API to create eight r3.large instances based on that master instance to be our shards.  This way they alr

[continue reading]

Posted by Ian Barwick in 2ndQuadrant on 2014-12-11 at 04:51:04

Japan has been an early and vigorous adopter of PostgreSQL (back in 2006, when PostgreSQL was still emerging from obscurity in the western hemisphere, I noted that in Tokyo bookstores, PostgreSQL books outweighed MySQL ones by about 5:3), and it’s no surprise that by nationality, only the USA and Germany have more committers. The Japan PostgreSQL User Group (JPUG) has been around a while too and is one of the oldest and most active worldwide (long enough to have established a Japanese logo for PostgreSQL which is often used in place of Slonik [*]). This year JPUG celebrates its 15th anniversary, and the 2014 conference – held in Tokyo on December 5th – was the largest yet with over 20 speakers, 5 (five!) parallel tracks and around 280 attendees.

Continuing a tradition started in 2012, the keynote speaker has been a foreign guest, and this year’s speaker was 2ndQuadrant’s very own Simon Riggs, who gave a well-received talk about the open source development process and PostgreSQL. (The interpreter did a fantastic job despite the occasional idiomatic diversion from the script.)

A key theme of this year’s conference was international exchange in general, and Asia in particular. As you can see from the schedule below, apart from the keynote speech there was an international track (rightmost column) with talks in English by speakers from a variety of Asian countries, and it’s hoped this will develop into a PgCon Asia.

Another theme to emerge was the need for increased communication between the Japanese PostgreSQL community and the rest of the planet – Japan is home to many interesting developments which often go unnoticed in the wider world – and 2ndQuadrant hopes to play a part in improving their visibility.

Finally, no modern conference is complete without lightning talks, which this year took place during the post-conference party. By an ancient Japanese tradition dating back to the earlier part of this year, each speaker (including yours truly, who burbled on about the ideas contained in the p

[continue reading]

Posted by Dan Robinson on 2014-12-10 at 17:45:57

Did you know PostgreSQL supports indexing a subset of your table? This enables very fast reads from that subset with almost no index overhead. It’s often the best way to index your data if you want to repeatedly analyze rows that match a given WHERE clause. This makes PostgreSQL a great fit for workflows that involve pre-aggregation with additional ad hoc analysis. In this post, I’ll walk through an example query optimization in which partial indexes are well suited.

Consider a table of events with the following schema:

Each event is associated with a user and has an ID, a timestamp, and a JSON representation of the event. The JSON includes the page path, the event type (e.g. click, page view, form submission), and any other properties that describe the event.
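The schema itself is not reproduced above, but a minimal table matching that description might be (a sketch; the table name is an assumption, while data and time match the column expressions used later in the post):

```sql
CREATE TABLE event (
    id      bigserial PRIMARY KEY,
    user_id bigint NOT NULL,
    time    timestamptz NOT NULL,
    data    jsonb NOT NULL
);
```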

We can use this table to store many different kinds of events, and let’s assume we have some automatic event tracking in place that logs every click, page view, and form submission so that we can analyze it later. We might want to have an internal dashboard that displays some high-value metrics, such as the number of weekly signups or the amount of revenue we are collecting per day. The events that are relevant to this dashboard make up a small fraction of this table — a very small percentage of clicks on your website are to finalize purchases! But they are mixed in with the rest of the table, so our signal-to-noise ratio is low.

We would like to index our data to make these dashboard queries fast.[1] Let’s start with a signup event, which we define to be a form submission on our /signup/ page. Getting the number of such signups in the first week of September translates to:
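The query itself is not shown above, but based on the expressions indexed later ((data->>'type'), (data->>'path'), time), it presumably resembles the following (the 'submit' event-type value and the table name are guesses):

```sql
SELECT count(*) FROM event
WHERE (data->>'type') = 'submit'
  AND (data->>'path') = '/signup/'
  AND time BETWEEN '2014-09-01' AND '2014-09-08';
```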

On a synthetic dataset with 10 million events, of which 3000 are signups, and without any indexes, this query takes 45 seconds.

Full Indexes On Single Columns: A Mixed Bag

A naive way to improve this performance is by creating single-column indexes for each of the relevant event features: (data->>'type'), (data->>'path'), and time. We can use a bitmap join between results from three i

[continue reading]

On 8th of December, Simon Riggs committed patch: REINDEX SCHEMA   Add new SCHEMA option to REINDEX and reindexdb.   Sawada Masahiko   Reviewed by Michael Paquier and Fabrízio de Royes Mello This is very simple thing, so the description will not be long. But – I'm mostly a DBA, so all things that give […]
Posted by Josh Berkus in pgExperts on 2014-12-09 at 06:09:00
One of the limitations with both RDS and Heroku is that you can only install the extensions they've approved and made available to you.  Except ... you can install "SQL-only" extensions which don't have a binary component.  Like pg_partman.

pg_partman is a set of functions and tables which help you manage your partitioned tables, including creating new partitions and triggers on the partition master.  I love it because it means I never have to write ad-hoc code for a client to do that again.  So of course I want to install it on our clients who are running on Heroku and RDS and need partitioning.  However, I have no access to "create extension partman" on the RDS command line, so how do I make it work?  Mostly by loading the extension the old way.

1) Create a new schema on the target database called "partman".

2) Install PostgreSQL 9.3 on my laptop, including the development files.

3) git clone pg_partman.

4) make & make install pg_partman on my laptop.  Inside the pg_partman source directory, that will create a file in the "sql" directory called (currently): pg_partman--1.7.2.sql.

5) Since that file is designed to be installed as an extension, I need to alter it. First, I replace all occurrences of "@extschema@" with "partman". Second, I delete the line "SELECT pg_catalog.pg_extension_config_dump('part_config', '');"

6) Load partman into your database using "psql -f pg_partman--1.7.2.sql YOUR_DB_NAME -h YOUR_HOST". In the RDS case, this needs to be done as the instance superuser.

pg_partman can now be used in the target database by calling all of the functions as "partman.function", or by putting the partman schema in your search_path.

Now, this has the major drawback that upgrading pg_partman once 1.7.3 comes out will be a very manual process.  But that's better than not using it at all.

Posted by gabrielle roth on 2014-12-09 at 04:27:00
For the last PDXPUG lab of the year, we tried out BiDirectional Replication (BDR). Five of us set up VMs on our laptops and followed the instructions on the wiki. We only had about 90 minutes for this lab, so the goal was to get a basic configuration up & running, understand the […]

PostgreSQL 9.5 has added a new mode for REINDEX with this commit:

commit: fe263d115a7dd16095b8b8f1e943aff2bb4574d2
author: Simon Riggs <>
date: Tue, 9 Dec 2014 00:28:00 +0900

Add new SCHEMA option to REINDEX and reindexdb.

Sawada Masahiko

Reviewed by Michael Paquier and Fabrízio de Royes Mello

Up to 9.4, REINDEX is able to run on different things:

  • INDEX, to reindex a given index.
  • TABLE, to reindex a table entirely, including its underlying toast index if it has a toast table.
  • DATABASE, to reindex all the relations of the database the user is currently connected to, including catalog tables.
  • SYSTEM, to reindex all the system catalogs.

A couple of things to note though:

  • Indexes being reindexed are locked with a strong exclusive lock, preventing any other session from touching them.
  • Parent tables are locked with a share lock.
  • DATABASE and SYSTEM cannot run in a transaction, and process tables one by one, pg_class being run first as REINDEX updates it (this last point is an implementation detail, not mentioned in the docs).

The new mode for SCHEMA mixes those things, but behaves similarly to DATABASE and SYSTEM, for a single schema:

  • It cannot run in a transaction.
  • Each table of the schema is processed one by one.
  • pg_class is reindexed first, only if pg_catalog is processed.

That's actually what you can see here; note that pg_class is at the top of the relations reindexed.

=# REINDEX SCHEMA pg_catalog;
NOTICE:  00000: table "pg_catalog.pg_class" was reindexed
LOCATION:  ReindexObject, indexcmds.c:1942
NOTICE:  00000: table "pg_catalog.pg_depend" was reindexed
LOCATION:  ReindexObject, indexcmds.c:1942

And that this operation is non-transactional:

=# REINDEX SCHEMA pg_catalog;
ERROR:  25001: REINDEX SCHEMA cannot run inside a transaction block
LOCATION:  PreventTransactionChain, xact.c:2976

A last thing to note is that a user that has no access on a schema will logically not be able to run REINDEX on it.


[continue reading]

On 7th of December, Simon Riggs committed patch: Event Trigger for table_rewrite   Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason.   Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. […]
Posted by Craig Ringer in 2ndQuadrant on 2014-12-08 at 03:27:44

For a couple of years now a team at 2ndQuadrant led by Andres Freund has been working on adding bi-directional asynchronous multi-master replication support to PostgreSQL. This effort has become known as the BDR project.

We’re really excited to see these efforts leading to new PostgreSQL features and have a great deal more still to come.

Incremental Development

As a large development project it is neither practical nor desirable to deliver all the changes to PostgreSQL as a single huge patch. That way lies madness and unmaintainable code. It would also be resoundingly rejected by the PostgreSQL community, as it should be.

Instead, BDR has been developed as a series of discrete changes to core PostgreSQL, plus an extension that uses those core changes to implement multi-master asynchronous logical replication.

To make this separation clearer we’ve split BDR into two separate trees since the last official release. We now manage and develop the set of patches to PostgreSQL separately from the BDR extension itself.

BDR is destined for community PostgreSQL

2ndQuadrant’s goal is to get all of BDR – the changes to the core server as well as the extension or its functionality – included in a future open source community PostgreSQL release.

When that’s achieved you’ll be able to use all the features currently provided by the BDR patches and extension with no 3rd party (open source or otherwise) patches, extensions and tools.

All of BDR is open source, released under the PostgreSQL license, and is developed on PostgreSQL’s public infrastructure. The git repository containing the authoritative current version of BDR is on

Getting BDR into users’ hands

With a project as big as BDR and with the PostgreSQL project’s safety-first, conservative and careful development style it’ll be a long time before all of BDR is included in community PostgreSQL. As a result we’ve had to make some compromises in order to get BDR functionality out to users prior to the expected end of the 9.5/9.6 release cycles in a cou

[continue reading]

Posted by Dave Page in EnterpriseDB on 2014-12-07 at 12:30:00
I’m often asked how I first became involved in PostgreSQL, and how the pgAdmin project got started. Much as I’m happy to tell the story over beer, it becomes tedious to do so over email after the first half-dozen or so times. So in a vain attempt to save my fingers from future carpal tunnel syndrome, here’s the tale…

I first discovered PostgreSQL in 1997. I was working at a social housing provider in the UK where, at the time, we were running Informix SE and Pick databases. We wanted to start building in-house web applications for managing additional areas of the business. There were no useful interfaces for Pick that could be used with any of the web technologies available to us at the time and the licensing model for Informix and its ODBC driver was prohibitively expensive, so I started looking for alternatives. 

After spending some time researching mSQL and MySQL, I eventually found PostgreSQL, which seemed to offer everything we needed—a SQL-based DBMS with ODBC and C interfaces, a liberal license, a rich set of features, and a vibrant community supporting it that was responsive to bug reports and open to feature requests. My colleagues and I developed a number of applications using PostgreSQL in the following years and the database became a critical part of our business. We had applications written in PHP for the web as well as Visual Basic for the desktop users. 

However, it was early on in our journey with Postgres that, as a Windows shop (on the desktop at least), we found ourselves wishing for a good graphical management tool for the database. At the time, the only option was pgAccess, which was a basic tool written in TCL/TK that was not easy to get to work on Windows. So I decided to write my own in Visual Basic. The first version was never released to the public and was modeled on the pgAccess design. Called pgManager, it was quickly abandoned as we found the UI to be restrictive and, well, not overly usable. The second incarnation was pgAdmin, which was redesigned from the ground up. It was relea

[continue reading]

Posted by Hubert 'depesz' Lubaczewski on 2014-12-06 at 18:47:33
The general knowledge is that numerics are slower than integers/floats, but offer better precision and ranges. While I understand what is slow, I don't really know how much slower numerics are. So let's test it. There are a couple of things to consider: table size (related to disk usage per column), speed of inserts […]

Oracle FDW is a foreign data wrapper PostgreSQL extension that allows you to read Oracle database tables from a PostgreSQL database. You can get it via the PGXN network or the main website

What is new about the latest 1.1.0 release is that there is now support for the Oracle SDO_GEOMETRY type, which allows you to map the most common geometry types POINT, LINE, POLYGON, MULTIPOINT, MULTILINE and MULTIPOLYGON to the PostGIS geometry type. Much of the spatial plumbing work was done by Vincent Mora of Oslandia. If we have any Windows Oracle users out there: yes, there are binaries available for Windows for PostgreSQL 9.1 - 9.4, for both 32-bit and 64-bit. The FDW does have a dependency on OCI.dll, which I think comes shipped with Oracle products. Unfortunately, we are not Oracle database users, so we can't kick the tires on this one.

Posted by Josh Berkus on 2014-12-04 at 18:59:38

First, SFPUG December meeting has been scheduled. It's about pg_shard, which has been covered elsewhere.

Second, a reminder that talk submissions for pgDay SF 2015 are due Monday for regular length talks. Please submit your proposals ASAP!

Posted by Josh Berkus in pgExperts on 2014-12-04 at 18:50:00
I've been working with CitusData for a while, and I'm excited that they've finally released pg_shard, a generic "sharding" extension for PostgreSQL databases.  The press releases have all of the noise about this, so this blog post is for the nuts-and-bolts of pg_shard.


What does pg_shard do?

pg_shard takes one or more tables in your database and distributes them over multiple databases on multiple servers, or "shards".  Rows are distributed based on a hash function on the primary key which you define.  The hash function used is Postgres' own hash function for hash indexes.

A single node is the "master node", which contains templates for the distributed tables and transparently handles making sure that inserts, updates, and selects go to the right shards.  To ensure redundancy, sharded tables can be defined as having multiple copies across multiple shards.

SELECTs are routed by comparing constants in your WHERE clause and other filter conditions against the hashed key, allowing pg_shard to query only the shards which can hold matching data.  This should speed up queries against really large tables a great deal.

Where there are multiple copies of shards, pg_shard notices timeouts when contacting the shards and marks the affected shard copies as offline.

All of this is done inside PostgreSQL, most of it using our defined APIs and hooks.  The only thing the Citus team had to fork was ruleutils.c, and they're hoping for changes in 9.5 so that even that fork isn't required.


What does pg_shard not do?

First, since the rows are distributed using a hash function, only equality comparisons are currently supported.  This means that if you query your table on ranges (such as BETWEEN two timestamps) it will scan all shards for that data.  Adding range partitioning to pg_shard is planned.

Second, unlike Postgres-XC/Postgres-XL, transactions between shards and distributed JOINs are not supported.  Those features are among the things which distinguish CitusData's proprietary product.

Third, currently only one master

[continue reading]

PostgreSQL offers many ways to define how a node reaches the end of recovery, with several parameters related to the recovery target: a timestamp, a transaction ID, a target name, and a recovery timeline via recovery_target_timeline. A parameter called pause_at_recovery_target, which has existed since 9.1, also allows a standby to be paused when the recovery target is reached. The upcoming release 9.5 has made some improvements in this area with the following commit:

commit: aedccb1f6fef988af1d1a25b78151f3773954b4c
author: Simon Riggs <>
date: Tue, 25 Nov 2014 20:13:30 +0000
action_at_recovery_target recovery config option

action_at_recovery_target = pause | promote | shutdown

Petr Jelinek

Reviewed by Muhammad Asif Naeem, Fujii Masao and
Simon Riggs

Its use is rather simple. When a standby has hot_standby enabled in postgresql.conf, meaning that it is able to execute read queries while in recovery, it is possible to perform one of the following actions by setting action_at_recovery_target in recovery.conf:

  • pause, which acts the same as when pause_at_recovery_target is enabled: the standby pauses any replay actions so that it is possible to check what state it is in once the recovery target has been reached. Note as well that recovery replay can be resumed using pg_xlog_replay_resume().
  • promote, to automatically promote the node and have it jump to the next timeline, making it available for write queries as well. This is the behavior when neither pause_at_recovery_target nor action_at_recovery_target is used, or when only pause_at_recovery_target is used and set to false.
  • shutdown, to simply shut down the standby once the target is reached. This is the real new addition that this feature brings, because it can be used to make a standby instance immediately ready for use after it has finished its recovery. Note that in this case the node will also need to re-apply all the WAL since the last checkpoint, so there is some cost to this mode. Moreover, recovery.conf is not renamed
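Put together, a recovery.conf that replays up to a point in time and then shuts the node down could look like this sketch (the restore command path and the target timestamp are placeholders):

```
# recovery.conf (9.5 devel); values are illustrative
standby_mode = 'on'
restore_command = 'cp /path/to/archive/%f %p'
recovery_target_time = '2014-12-01 00:00:00'
action_at_recovery_target = 'shutdown'
```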

[continue reading]

Posted by Paul Ramsey on 2014-12-02 at 21:22:00

Most users of PostGIS are safely ensconced in the world of Linux, and their build/deploy environments are pretty similar to the ones used by the developers, so any problems they might experience are quickly found and removed early in development.

Some users are on Windows, but they are our most numerous user base, so we at least test that platform preemptively before release and make sure it is as good as we can make it.

And then there's the rest. We've had a passel of FreeBSD bugs lately, and I've found myself doing Solaris builds for customers, and don't get me started on the poor buggers running AIX. One of the annoyances of trying to fix a problem for a "weird platform" user is just getting the platform setup and running in the first place.

So, having recently learned a bit about vagrant, and seeing that some of the "weird" platforms have boxes already, I thought I would whip off a couple vagrant configurations so it's easy in the future to throw up a Solaris or FreeBSD box, or even a quick CentOS box for debugging purposes.
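Such a configuration can be very short. A sketch (the box name and package list are illustrative, and it assumes a guest box with OpenCSW's pkgutil available):

```ruby
# Vagrantfile: a disposable Solaris build box
Vagrant.configure("2") do |config|
  config.vm.box = "solaris-11"   # hypothetical box name
  # Pull the build dependencies from OpenCSW
  config.vm.provision "shell", inline: <<-SHELL
    /opt/csw/bin/pkgutil -y -i libxml2_dev gcc4core
  SHELL
end
```

A `vagrant up` then gives a reproducible guest to build and debug in, and `vagrant destroy` throws it away.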

I've just been setting up my Solaris Vagrantfile and using my favourite Solaris crutch: the OpenCSW software repository. But as I use it, I'm not just adding the "things I need", I'm implicitly choosing an environment:

  • my libxml2 is from OpenCSW
  • so is my gcc, which is version 4, not version 3
  • so is my postgres

This is convenient for me, but what are the chances that it'll be the environment used by someone on Solaris having problems? They might be compiling against libraries from /usr/sfw/bin, or using the Solaris gcc-3 package, or any number of other variants. At the end of the day, when testing on such a Solaris environment, will I be testing against a real situation, or a fantasyland of my own making?

For platforms like Ubuntu (apt) or Red Hat (yum) or FreeBSD (ports), where there is One True Way to get software, the difficulties are smaller, but even then there is no easy way to get a "standard environment", or to quickly replicate the combinations of versions a user might have run into

[continue reading]