pgBackRest is one of the most powerful backup solutions available for PostgreSQL. It has enterprise-level features such as compression, multiple channels (threads) of backup execution, and incremental and differential backups.
The official documentation is Debian-centric in its focus. I wanted to test it out with the PostgreSQL sandbox from the BigSQL project.

Setting up PostgreSQL Sandbox and Installing pgBackRest

The BigSQL Project makes it easy to install and set up PostgreSQL and its associated components across different operating systems. In this document, we are going to look at how to set it up on Ubuntu 14.04. Linux binaries of the sandbox can be downloaded from the BigSQL download page.

The sandbox installation requires only the unpacking of the downloaded file.

tar -xvf bigsql-9.5.3-5-linux64.tar.bz2
cd bigsql/

Using the command line utility (pgc) supplied with the sandbox, it's very easy to initialize and start a PostgreSQL instance.

./pgc init pg95
./pgc start pg95

A PostgreSQL instance should now be up and running.
The same pgc utility can be used to install pgBackRest.

./pgc install backrest

Install Perl Dependencies

An important aspect to keep in mind is that pgBackRest is written in Perl and depends on a number of Perl libraries and modules.
An easy way to install all the dependencies in one shot is to instruct the apt-get utility to install one of the leaf components in the dependency chain.

sudo apt-get install libdbd-pg-perl

This command should fetch all the Perl dependencies of pgBackRest.

Setting Up a Backup Repository Directory

Set up a backup repository directory for pgBackRest with the following commands.

sudo mkdir /var/log/pgbackrest
sudo chmod 750 /var/log/pgbackrest

IMPORTANT for this test:

  1. pgbackrest and the postgres server process should run as the same OS user.
  2. The backup repository directory should be owned by the same OS user.

Change the ownership of the repository directory to the user under which the postgres process is running. If the user is “postgres” and t
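Assuming the server runs as the conventional postgres OS user (substitute whichever user actually runs your instance), the ownership change would look like this:

sudo chown -R postgres:postgres /var/log/pgbackrest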

Posted by Tatsuo Ishii in SRA OSS, Inc. on 2016-07-26 at 02:10
We now have a new Pgpool-II mascot logo!

This image was created by a professional designer and donated by SRA OSS, Inc. Japan.

You can find a few variations of the image at the Pgpool-II's official site.
The images are licensed under the Creative Commons License, so you are free to use them on your own site if you like, and even to customize or adapt them!

Your client session to Pgpool-II will be disconnected once a failover or switchover happens, which is pretty annoying. This is because Pgpool-II kills all the child processes that are responsible for client sessions. Pgpool-II 3.6 will mitigate this under certain conditions:

  1. Pgpool-II operates in streaming replication mode
  2. The failed DB node is not the primary (master) node
  3. Your "load balance node" is not the failed node

Conditions 1 and 2 are easy to understand. What about condition 3?

The load balance node is chosen when you connect to Pgpool-II: Pgpool-II assigns one of the DB nodes to receive read-only queries. The choice is made according to several pgpool.conf settings:

  •  load_balance_mode (of course this should be "on")
  • "weight" parameter
  • database_redirect_preference_list
  • app_name_redirect_preference_list
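As a hedged sketch, the relevant pgpool.conf fragment might look like this (hostnames and values are illustrative; the per-backend "weight" is set via the numbered backend_weight parameters):

load_balance_mode = on
backend_hostname0 = 'node0'
backend_weight0 = 1
backend_hostname1 = 'node1'
backend_weight1 = 1
database_redirect_preference_list = 'test:primary'
app_name_redirect_preference_list = 'psql:primary'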

The decision of which DB node to choose is made at an early stage of the session connection, and the assignment will not change until you exit the session. Starting with Pgpool-II 3.6, you can check your load balance node by using the "show pool_nodes" command.

$  psql -p 11000 test
test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node
---------+----------+-------+--------+-----------+---------+------------+-------------------
 0       | /tmp     | 11002 | 2      | 0.333333  | primary | 0          | false
 1       | /tmp     | 11003 | 2      | 0.333333  | standby | 0          | false
 2       | /tmp     | 11004 | 2      | 0.333333  | standby | 0          | true
(3 rows)

Here "load_balance_node" is the DB node chosen for the "load balance node".

If a node other than node 2 goes down and that node is not the primary, this session will not be disconnected. In this case, the session survives if node 1 goes down. Let's try that using another terminal:

$ pg_ctl -D data1 -m f stop
waiting for server to shut down.... done
server stopped

Ok, let's input something in the previous psql session:

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt |
Posted by Ian Barwick in 2ndQuadrant on 2016-07-25 at 01:20

The recently released repmgr 3.1.4 update incorporates several changes which improve usability and lay out the groundwork for enhanced compatibility with 2ndQuadrant’s barman product.

New configuration option restore_command

It’s now possible to specify a restore_command in repmgr.conf, which will be included in the recovery.conf file generated by repmgr standby clone, making it easier to configure a more robust replication setup by enabling PostgreSQL to fall back to a WAL archive source if streaming replication is interrupted. See Gabriele Bartolini’s recent blog post “Speed up getting WAL files from Barman” for an example on how to do this.

CSV output for repmgr cluster show

The repmgr cluster show command now accepts the optional parameter --csv, which outputs the status of the replication cluster in CSV format, suitable for parsing by scripts. The current “human-readable” output looks something like this:

    $ repmgr -f  /path/to/repmgr.conf cluster show

    Role      | Name  | Upstream | Connection String
      FAILED  | node1 |          | host=localhost dbname=repmgr user=repmgr port=5501
    * master  | node2 |          | host=localhost dbname=repmgr user=repmgr port=5502
      standby | node3 | node2    | host=localhost dbname=repmgr user=repmgr port=5503

The equivalent CSV output is this:

   $ repmgr -f /path/to/repmgr.conf cluster show --csv

with node ID as the first column, and status (0 = master, 1 = standby, -1 = failed) as
the second.
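Based on that description (node ID first, status code second), the CSV equivalent of the cluster shown above would look something like the following; the node IDs here are illustrative:

    1,-1
    2,0
    3,1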

conninfo strings now accepted

repmgr operations which require the provision of database connection information, such as repmgr standby clone, now accept conninfo-style strings. This brings repmgr into line with other PostgreSQL utilities and is more than a convenient syntax – it provides much more granular control over the primary_conninfo string in the recovery.conf file generated by repmgr. This is particularly relevant when setting up SSL-based replication.

Posted by Shaun M. Thomas on 2016-07-22 at 17:55

The pure, unadulterated, presumptuous impudence of NoSQL. Engines like MongoDB recklessly discard concepts like ACID in some futile quest to achieve “web scale”, and end up accomplishing neither. RDBMS systems have literally decades of history to draw upon, and have long since conquered the pitfalls NoSQL platforms are just now encountering. There may be something to a couple of them, but by and large, they’re nothing we really need.

At least, that’s something I might have said a couple of weeks ago.

Now, I’ve never really subscribed to the supposition that NoSQL is the strict antithesis of traditional SQL-driven systems. The intro paragraph is something of a hyperbolic exaggeration of the general disdain that seems to permeate the NoSQL vs SQL debate. Most DBAs I’ve met basically say “meh” and move on with their day, optimizing queries, tweaking storage metrics, and whatnot. Usually NoSQL is on our radar, but we have other stuff to worry about.

Last week, I was forced to reconsider due to an existing internal project that needed my input. That analysis really showed me why devs and data guys from all walks of life are seduced by the Dark Side of data. After another week slogging through documentation, experimenting with self-healing replication sets, self-balancing sharding, and taking copious notes on everything, something finally clicked. The Dark Side, it seems, has some merit beyond simplifying data structure.


So what can Postgres potentially learn from its eternal adversaries? Beyond stealing their encoding mechanisms via JSON and JSONB datatypes, that is. Though my analysis thus far has focused mostly on MongoDB, it alone incorporates several fundamental concepts that I’ve always felt Postgres lacked. For now, I’ll focus on the two that have—in my opinion, anyway—really prevented Postgres from reaching its full potential.

Break Me

Postgres has replication, but no meaningful application beyond having a hot standby or a few read slaves. Why no master election consensus system like RAFT? Last year, I came ac

We've just released a new version of PGHoard, the PostgreSQL cloud backup tool we initially developed for Aiven and later open sourced.

Version 1.4.0 comes with the following new features:
  • Support for PostgreSQL 9.6 beta3
  • Support for backing up multiple tablespaces
  • Support for StatsD and DataDog metrics collection
  • Basebackup restoration now shows download progress
  • Experimental new WAL streaming mode walreceiver, which reads the write-ahead log data directly from the PostgreSQL server using the streaming replication protocol
  • New status API in the internal REST HTTP server
Please see our previous blog post about PGHoard for more information about the tool and a guide for deploying it.

Backing up multiple tablespaces

This is the first version of PGHoard capable of backing up multiple tablespaces. Multiple tablespaces require using the new local-tar backup option, which reads files directly from disk instead of streaming them with pg_basebackup, because pg_basebackup doesn't currently allow streaming multiple tablespaces without writing them to the local filesystem.
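As a rough sketch, enabling this mode in PGHoard's JSON configuration might look like the following (key names and paths should be checked against the PGHoard documentation; the site name and directory are illustrative):

{
    "backup_sites": {
        "example-site": {
            "basebackup_mode": "local-tar",
            "pg_data_directory": "/var/lib/postgresql/9.5/main"
        }
    }
}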

The current version of PGHoard can use the local-tar backup mode only on a PG master server; PostgreSQL versions prior to 9.6 don't allow users to run the necessary control commands on a standby server without the pgespresso extension. pgespresso also required fixes, which we contributed, to support multiple tablespaces - once a fixed version has been released, we'll add support for it to PGHoard.

The next version of PGHoard, due out by the time of the PostgreSQL 9.6 final release, will support local-tar backups from standby servers: natively when running 9.6, and via the pgespresso extension when running older versions with the latest version of the extension.

A future version of PGHoard will support backing up and restoring PostgreSQL basebackups in parallel mode when using the local-tar mode.  This will greatly reduce the time required for setting up a new standby server or restoring a system from backups.

Streaming replication support

This v[...]
Posted by Joshua Drake in CommandPrompt on 2016-07-21 at 19:15

Once upon a time FOSS was about Freedom. It was about exposing equality within source code. It allowed everyone equal rights and equal access to the technology they were using. An idea that if you were capable, you could fix code or pay someone to fix code. An ideology that there was something greater than yourself and that there was an inherent right built into what it is to be human with software.

Leaders to lemmings

I sat in a bar slowly nursing beers with other community members over a period of hours. We spoke of many things. We spoke of the never-done new PostgreSQL website. We spoke of my distaste for Amazon Web Services since reformed, with the exception of S3. We spoke of life. We argued, we had honest discourse and that is excellent. There was nobody complaining of political correctness. There was nobody claiming to be “offended”. There was nobody leaving because their feelings were hurt. There was a community member who passed out in his chair and dropped his phone. We walked him to his room to make sure he was safe. All was good.

This retrospective has been digging around in my grey matter since that night six months ago. Originally this was going to just be the stuff of legendary and exaggerated stories among community members that are only getting older and a few who are young but will get there someday. That is, until it began to itch, and as with any good community member, I am scratching that itch.

“My time is precious to me”

It seems like a harmless thing to say. Of course your time is precious to you. I would say that is probably true of most people. I know that my time is precious to me. I make it a point of working part time from May - September so I can take time for my family. (Don’t worry, I more than make up for it the rest of the year).

The problem with the statement is the context. The statement came from a well known contributor and a very smart guy. The reference was in relation to why someone would use software as a service and the general idea was: Software as a Service is awesome b


We’re having our 10th Anniversary meeting tonight!

As in previous years, I generated speaker & topic tag clouds:

[Images: speaker tag cloud, topic tag cloud]
When: 6-8:30pm Thursday July 21, 2016
Where: iovation
What: 10th Anniversary Social
This meeting is 100% social and will take the place of our regular July meeting.  iovation will provide a light dinner and beverages.  We will provide the conversation and reminiscing.  There will not be any long speeches, but there will be cupcakes.
iovation is on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak).  It’s right on the Green & Yellow Max lines.  Underground bike parking is available in the parking garage;  outdoors all around the block in the usual spots.  No bikes in the office, sorry!  Elevators open at 5:45 and building security closes access to the floor at 6:30.
See you there!

PostgreSQL is an awesome project and it evolves at an amazing rate. We’ll focus on evolution of fault tolerance capabilities in PostgreSQL throughout its versions with a series of blog posts. This is the third post of the series and we’ll talk about timeline issues and their effects on fault tolerance and dependability of PostgreSQL.

If you would like to witness the evolution progress from the beginning, please check the first two blog posts of the series:

  1. Evolution of Fault Tolerance in PostgreSQL 
  2. Evolution of Fault Tolerance in PostgreSQL: Replication Phase 



The ability to restore the database to a previous point in time creates some complexities; we'll cover some of these cases by explaining failover (Fig. 1), switchover (Fig. 2) and pg_rewind (Fig. 3) later in this topic.

For example, in the original history of the database, suppose you dropped a critical table at 5:15PM on Tuesday evening, but didn’t realise your mistake until Wednesday noon. Unfazed, you get out your backup, restore to the point-in-time 5:14PM Tuesday evening, and are up and running. In this history of the database universe, you never dropped the table. But suppose you later realize this wasn’t such a great idea, and would like to return to sometime Wednesday morning in the original history. You won’t be able to if, while your database was up-and-running, it overwrote some of the WAL segment files that led up to the time you now wish you could get back to.

Thus, to avoid this, you need to distinguish the series of WAL records generated after you’ve done a point-in-time recovery from those that were generated in the original database history.

To deal with this problem, PostgreSQL has a notion of timelines. Whenever an archive recovery completes, a new timeline is created to identify the series of WAL records generated after that recovery. The timeline ID number is part of WAL segment file names so a new timeline does not overwrite the WAL data generated by previous timelines. It is in fact possible to archive many diff
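Because the timeline ID is the first 8 hex digits of the 24-digit segment file name, you can read it straight off a directory listing. A small illustration (the segment name here is made up):

```shell
# WAL segment file names are 24 hex digits: the first 8 are the timeline ID,
# the next 8 the log file number, the last 8 the segment number.
wal=000000020000000A000000FE        # hypothetical segment on timeline 2
printf '%s\n' "$wal" | cut -c1-8    # -> 00000002
```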

Posted by Peter Eisentraut in 2ndQuadrant on 2016-07-20 at 12:00

When you are coding in a source tree as big as PostgreSQL’s, you will at some point want to look into some kind of source code indexing. It’s often convenient not to bother, since git grep is actually superfast. But when you want to find where a function is defined among all the call sites, some more intelligence is useful.

The traditional tools for this are ctags and etags, which create index files intended for use by vi and Emacs, respectively. The PostgreSQL source tree has some customized support for these in the tools src/tools/make_ctags and src/tools/make_etags. Because these tools operate on a directory level, those wrapper scripts create a single tag file (named tags or TAGS respectively) in the top-level directory and symlink it to all the other directories. This allows you to easily look for entries across the entire source tree. But it’s clearly a hack, and at least Emacs is often somewhat confused by this setup.

But there is something much better that works very similarly: GNU GLOBAL. A main difference is that GNU GLOBAL works on a project basis not on a directory basis, so you don’t need to do contortions to create and manage tags files all over your source tree. Also, GLOBAL can be used from the command line, so you don’t need to be an editor wizard to get started with it. Plus, it appears to be much faster.

The whole thing is very simple. Install the package, which is usually called global and available in most operating system distributions. To start, run

$ gtags

in the top-level directory. This creates the files GPATH, GRTAGS, and GTAGS.

Then you can use global to search for stuff, like

$ global elog

Or you can look for places a function is called:

$ global -r write_stderr

You can run global in any directory.

Or how about you want to look at the code where something is defined:

$ less -t elog

Note no file name is required. (See the manual for the required setup to make this work with less.)

Or of course use editor integration. For Emacs, there is ggtags-mode.
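The index can also be updated incrementally after the source tree changes, rather than re-running gtags from scratch (see the GLOBAL manual):

$ global -u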



Starting from Barman 1.6.1, PostgreSQL standby servers can rely on an “infinite” basin of WAL files and finally pre-fetch batches of WAL files in parallel from Barman, speeding up the restoration process as well as making the disaster recovery solution more resilient as a whole.

The master, the backup and the standby

Before we start, let’s define our playground. We have our PostgreSQL primary server, called angus. A server with Barman, called barman and a third server with a reliable PostgreSQL standby, called chris – for different reasons, I had to rule out the following names bon, brian, malcolm, phil, cliff and obviously axl. ;)

angus is a high workload server and is continuously backed up on barman, while chris is a hot standby server with streaming replication from angus enabled. This is a very simple, robust and cheap business continuity cluster that you can easily create with pure open source PostgreSQL, yet capable of reaching over 99.99% uptime in a year (according to our experience with several customers at 2ndQuadrant).

What we are going to do is to instruct chris (the standby) to fetch WAL files from barman whenever streaming replication with angus is not working, as a fallback method, making the entire system more resilient and robust. Most typical examples of these problems are:

  1. temporary network failure between chris and angus;
  2. prolonged downtime for chris which causes the standby to go out of sync with angus.

For further information, please refer to the Getting WAL files from Barman with ‘get-wal’ blog article that I wrote some time ago.

Technically, we will be configuring the standby server chris to remotely fetch WAL files from barman as part of the restore_command option in the recovery.conf file. Since the release of Barman 1.6.1 we can take advantage of parallel pre-fetching of WAL files, which exploits network bandwidth and reduces recovery time of the standby.
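A hedged sketch of the relevant part of recovery.conf on chris (the user names are assumptions, and the parallel pre-fetch added in Barman 1.6.1 has its own options; check the Barman documentation for the exact invocation):

standby_mode = 'on'
primary_conninfo = 'host=angus user=streaming_user'
restore_command = 'ssh barman@barman "barman get-wal angus %f" > %p'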


This scenario has been tested on Linux systems only, and requires:

  • Barman >= 1.6.1 on the barman server
  • Python w
Posted by Simon Riggs in 2ndQuadrant on 2016-07-18 at 16:45

Just got back from an extended trip to Database.Camp – Sunday, July 10, 2016 – at the UN in NYC – totally free!

Wide range of presentations from CTOs/CEOs from across the database software world, with about 200 attendees, slightly more people earlier in the day.

Very happy to report we had 3 presentations covering different aspects of PostgreSQL
* EnterpriseDB – Ed Boyajian – Building Postgres Ecosystems
* 2ndQuadrant – Simon Riggs – PostgreSQL: The Next 10 Years
* CartoDB – Javier de la Torre – Celebrating PostgreSQL and demonstrating excellent data visualizations

I don’t think any attendee would have missed the fact that 3 companies spoke about Postgres, whereas all other technologies were discussed only once. I noted that when asked “Do you use database X?”, more attendees’ hands went up for Postgres than for any other. I’m also very happy to report that our talks interlocked nicely with one another, showing attendees that we can work together as a community.

Best talk of the day was by Minerva Tantoco, CTO for the City of New York. Engaging, entertaining, and greatly encouraging to see technology being adopted for the public good. Free Wifi at JFK would be nice… but it looks like it’s coming across the whole of NYC, which will really change what is possible.

Well worth attending if they hold it again next year. Congrats to Eric for arranging, and making it for a good cause: Techie Youth.

TL;DR: With the kind support of PostgreSQL Europe, I am launching a new edition of the PostgreSQL Magazine in an innovative paper format. The primary goal is to publish it and send copies to PostgreSQL Conference Europe in Tallinn in November. If we’re successful, additional copies will be sent to other events.

I’m here to talk to you about the new direction for PG Magazine. So far, we’ve produced 2 issues of the magazine and we’ve received enthusiastic feedback from long-time community members as well as complete newcomers. I still believe that paper is a great medium to promote PostgreSQL and that there’s a need for a regular printed media in the community.

However, we’ve been struggling to find a stable economic model to ensure this media remains a community-driven effort. So far, the balance we’ve found is that the content is produced, edited and controlled by community members, while the graphic editing, printing and shipping is done by professionals and paid for by non-profit associations (PG Europe, SPI, and others) or commercial sponsors (EDB, for instance).

This model works with the current format of the magazine (36 pages) but it’s very slow. Producing a new issue requires a lot of time to gather the articles, edit the content and proof-read everything. It’s also very hard to translate. So far only the Chinese community has been able to fully translate the magazine.

Based on these considerations, here’s an attempt to make the whole process more dynamic while switching to a more appealing format.

Over the last month, I’ve worked on what I called a “folded magazine”. The basic idea is that the magazine is presented as an A5 newspaper. Then you unfold it and you get an A4 newspaper. Then you unfold it and you get an A3 newspaper. Then you unfold it and you get an A2 poster that you can take back at the office…

I used this concept for the 2016 PG Day France booklet. Here’s a video:

This format has drawbacks

a- The graphical editing needs to be done by professionals. A P

Posted by Jan Wieck in OpenSCG on 2016-07-16 at 15:16
In the short time that I have been enhancing the PL profiler, I have been asked multiple times if it can be abused as a debugging aid. Not directly. The conversation goes something like this:

Q: When my PL code is stuck somewhere, can I turn on profiling and see where?
A: No.
Q: Will this feature be added?
A: No.

Of course that would be a useful feature; I don't argue that. And it seems that this is precisely how Oracle users find out where their PL/SQL code gets stuck or moves like molasses. However, the reason I am against adding this is that a profiler, or even parts of it, should not be enabled 24x7 on a production database server just so that it can someday be used to extract some debugging information. There are other ways to get that information, and I will show you one of them.

If a program is stuck somewhere, one uses a debugger to get a backtrace. This works with C programs, like the database server, provided that symbol information is available. In that backtrace (and some variables) we also find the PL/pgSQL backtrace. Having symbol information available is also useful in case of a program crash, to find out why it crashed by loading the core dump into the debugger. 
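For example, with debug symbols installed, one can attach gdb to the backend and dump a backtrace non-interactively (the PID here is illustrative):

gdb -p 13239 -batch -ex bt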

Every line of PL code that is executed goes through the PL/pgSQL executor function exec_stmt(). There we find the current PL code's line number in a variable. Likewise, every function execution goes through either plpgsql_exec_function() or plpgsql_exec_trigger(). In those stack frames we find the OID of the function as well as its signature (name and call argument types).

Doing this several times for a deeply nested PL/pgSQL program is tedious and no fun. So here is a little script called plbacktrace that does all of that. It is invoked with the PID of the PostgreSQL backend and will output information like:

[postgres@db1 tmp]$ ./ 13239
fn_oid=105672 lineno=5 func="life_get_cell_pl(life_board,integer,integer)"

I’ve been almost exclusively a Postgres DBA for a seemingly interminable length of time. While this is good for specializing, nobody wants to be a One-Trick Pony. And aside from the occasional bit of Python to write more advanced tools when Bash isn’t up to the job, it’s All Postgres All The Time. While few things would make me happier, it pays to branch out occasionally.

When NoSQL databases hit the scene a few years ago, I pretty much ignored them wholesale. I read about what they did and how they did it, and while intriguing, none of that seemed particularly better than what Postgres delivered. Cassandra could scale outwards, but has no real concept of NULL and limits searches to “indexed” columns. Yet sites like Reddit prove how well it can scale when properly leveraged. MongoDB is basically a native JavaScript filesystem, which I ignored because Postgres support for JSON and JSONB effectively deprecated it before it was even on my radar. There are others, of course: CouchDB, Redis, HBase, and a plethora of alternatives.

There’s only so much time in a day though, and being as we use MongoDB for a couple of projects, it only makes sense to investigate how it really works. What can a Postgres user, who eats, sleeps, and breathes Postgres, hope to learn from a NoSQL database? I guess we can start with some simple and basic operations. Can I make a table, fill it with 1-million rows, update a few, delete a few, add an extra index for future searches, and perform a query or two?

Let’s start with the Postgres schema I use for practically everything:

CREATE TABLE sensor_log (
  id            SERIAL PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (id, location, reading, reading_date)
SELECT, % 1000, % 100,
       CURRENT_DATE - (( * 10) || 's')::INTERVAL
  FROM generate_series(1, 1000000) s(id);

Time: 7236.906 ms

CREATE INDEX idx_sensor_log_date
    ON sensor_log (reading_date);

If your PostgreSQL instance is running on an Ubuntu LTS system that you need to upgrade to the most recent release, say from precise to trusty – because, well, sooner or later you must – you need to consider what is going to happen to your database.

The upgrade process described in this article is similar to what you would have to do if you were upgrading from Trusty to Xenial, the newest Ubuntu LTS release.

Ubuntu attempts to make the process of upgrading to the newest distribution release easy and hassle-free. In fact, it is the case in many situations but not when there is PostgreSQL running in the system. If you just go ahead and try to run do-release-upgrade command, which is the officially recommended way to upgrade your Ubuntu LTS distribution to the newest release, you will end up seeing this error message: 


Get:72 trusty-backports/universe Translation-en [34.6 kB]
Fetched 23.3 MB in 6s (0 B/s)

Checking package manager
Reading package lists... Done
Building dependency tree
Reading state information... Done
Building data structures... Done
Calculating the changes
Calculating the changes
Could not calculate the upgrade

An unresolvable problem occurred while calculating the upgrade.

This can be caused by:

* Upgrading to a pre-release version of Ubuntu
* Running the current pre-release version of Ubuntu
* Unofficial software packages not provided by Ubuntu

If none of this applies, then please report this bug using the
command 'ubuntu-bug ubuntu-release-upgrader-core' in a terminal.

Not very helpful, is it? Well, clearly you need to troubleshoot. Where do you start?

Examine /var/log/dist-upgrade/main.log and look for ERROR messages. This is what you will see: 

2016-02-09 07:19:01,392 DEBUG blacklist expr '^postgresql-.*[0-9]\.[0-9].*' matches 'postgresql-plperl-9.3'
2016-02-09 07:19:01,393 DEBUG The package 'postgresql-plperl-9.3' is marked for removal but it's in the removal blacklist
2016-02-09 07:19:01,462 ERROR Dist-upgrade failed: 'The package 'postgresql-plperl-9.3' is marked fo
Posted by Stefan Petrea on 2016-07-15 at 07:15


Most web applications make use of databases. When deploying your app in production, you will need, among other things, monitoring and analysis/reports of log data generated by your application.

This comes in handy when troubleshooting problems, in situations involving bottlenecks in large applications, where we need to find low-performing SQL queries.


This is a simplified diagram that shows the interactions between the main components of a typical web application production environment.

The users make requests to the web application. In order to serve those requests, the web application needs information from a database; in this case the database being used is PostgreSQL, so it makes queries to it.

The PostgreSQL database server executes the queries and returns the data, but at the same time it also writes to log files on disk.

The log files are rotated on a daily basis, so for each day there is one log file. The log files start out empty; each time one is reused, it is truncated and new log data is written to it.

[Diagram: three users send requests to the web application, which queries the PostgreSQL database server; the server writes to rotating daily log files (Monday through Sunday).]

PostgreSQL logging and log rotation

Most of the settings described here are documented in the 18.8 Error reporting and logging section of the PostgreSQL manual.

We turn on the logging collector and ensure that we have %w in the daily log filenames (this is the 0-based weekday as a decimal number, with 0 being Sunday, 1 being Monday, and so on).

Once the logging collector is enabled, the logs are no longer stored in /var/log/postgresql/ , but instead, they will be located at /var/lib/postgresql/9.4/main/pg_log/ .

We're also aiming to truncate the file on rotation.

logging_collector = on
log_filename = 'postgresql-%w.log'
log_truncate_on_rotation = on
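The %w escape follows strftime(3); you can check the value for the current day with date, which uses the same convention:

```shell
date +%w    # prints a digit from 0 (Sunday) to 6 (Saturday)
```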

PostgreSQL’s manual is generally pretty clear about the locks taken by various operations – but nothing’s perfect. If you need to, you can also ask PostgreSQL directly.

You can check lock levels trivially with psql or PgAdmin.

For example, to see what lock alter table some_table disable trigger some_trigger; takes:

test=> BEGIN;
test=> ALTER TABLE some_table DISABLE TRIGGER some_trigger;
test=> SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND relation = 'some_table'::regclass;
 locktype |         mode          
----------+-----------------------
 relation | ShareRowExclusiveLock
(1 row)

That’s for a lock on a table. It’s easy to see from this that we take a SHARE ROW EXCLUSIVE lock, which according to the manual:

… protects a table against concurrent data changes, and is self-exclusive so that only one session can hold it at a time.

Other uses of pg_locks

You can also filter on pg_locks in other ways to see other locks held by a transaction. You’ll have to do some joins on pg_class etc. to decode the relation OIDs to names, which is why we really need a pg_stat_locks view in PostgreSQL to make this easier.
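Until such a view exists, a minimal sketch of that decoding join might look like the following (run from the session whose locks you want to inspect):

```sql
-- A hedged sketch: decode the relation OIDs locked by the current backend
-- into table names by joining pg_locks against pg_class.
SELECT c.relname, l.locktype, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE l.pid = pg_backend_pid()
ORDER BY c.relname;
```

To look at another session's locks, filter on that session's pid instead of pg_backend_pid().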

People use a variety of canned queries of varying quality to see which processes currently block others. With the addition of more detailed lock wait information and pg_blocking_pids() in 9.6, this will get a lot easier, though 9.6 doesn’t add a helper view yet.

What are the virtualxid and transactionid locks?

One important and possibly confusing thing you’ll see in pg_locks is that every transaction holds a special lock on itself, called the virtualxid lock:

test=> BEGIN;
test=> SELECT * FROM pg_locks WHERE pid = pg_backend_pid();
  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath 

After doing full-time PostgreSQL consulting for over 16 years now, I actually don’t remember a time without inheritance anymore. Of course things were improved over time, but in my head it has always been there and it has always just worked as expected. After so many years I still love the feature because it offers […]

The post Inheritance – One more reason to love PostgreSQL appeared first on Cybertec - The PostgreSQL Database Company.

Constraints in Postgres are very powerful and versatile: not only are foreign keys, primary keys, and column uniqueness done internally via constraints, but you may create your own quite easily (at both the column and table level). Most of the time constraints are simply set and forget, but there is one time constraints may become a problem: copying the database using the pg_dump program.

The issue is that constraints are usually added *before* the data is copied to the new table via the COPY command. This means the constraint fires for each added row, to make sure that the row passes the conditions of the constraint. If the data is not valid, however, the COPY will fail, and you will not be able to load the output of your pg_dump into a new database. Further, there may be a non-trivial performance hit doing all that validation. Preventing the constraint from firing may provide a significant speed boost, especially for very large tables with non-trivial constraints.

Let's explore one way to work around the problem of pg_dump failing to work because some of the data is not valid according to the logic of the constraints. While it would be quicker to make some of these changes on the production system itself, corporate inertia, red tape, and the usual DBA paranoia means a better way is to modify a copy of the database instead.

For this example, we will first create a sample "production" database and give it a simple constraint. This constraint is based on a function, to both emulate a specific real-world example we came across for a client recently, and to allow us to easily create a database in which the data is invalid with regards to the constraint:

dropdb test_prod; createdb test_prod
pgbench test_prod -i -n
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.82 s, remaining 0.00 s)
set primary keys...
psql test_prod -c 'create function valid_account(int) returns bool language sql immutable as $$ SELECT $1 > 0$$;'
psql test_prod -c 'alter table pg
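The excerpt cuts off mid-command; a hedged sketch of what such a function-based constraint might look like follows (the table and column names assume pgbench's defaults and are assumptions here, not the post's actual command):

```sql
-- Hypothetical reconstruction: attach the IMMUTABLE function as a CHECK
-- constraint, so each row is validated against it as it is loaded.
ALTER TABLE pgbench_accounts
  ADD CONSTRAINT valid_account CHECK (valid_account(aid));
```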

Since Postgres started supporting NoSQL (via hstore, json, and jsonb), the question of when to use Postgres in relational mode vs NoSQL mode has come up a lot. Do you entirely abandon traditional table structures, and go with documents all the way? Or do you intermingle both? The answer unsurprisingly is: it depends. Each newer model including hstore, JSON, and JSONB has their ideal use cases. Here we’ll dig deeper into each and see when you should consider using them.


If you exclude XML, this was the first truly unstructured datatype to arrive in Postgres. Hstore arrived way back in Postgres 8.3, before upsert, before streaming replication, and before window functions. Hstore is essentially a key/value store directly in Postgres. With hstore you’re a little more limited in terms of the datatypes you have: you essentially just get strings. You also don’t get any nesting; in short it’s a flat key/value datatype.

The upside of hstore is you don’t have to define any of your keys ahead of time. You can simply insert the record and it’ll save everything. Let’s say you’ve got an example table:

CREATE TABLE products (
  id serial PRIMARY KEY,
  name varchar,
  attributes hstore
);
From here you can insert whatever you want into the attributes column. And then query based on those various keys or values.

INSERT INTO products (name, attributes) VALUES (
 'Geek Love: A Novel',
 'author    => "Katherine Dunn",
  pages     => 368,
  category  => fiction'
);

SELECT name, attributes->'author' as author
FROM products
WHERE attributes->'category' = 'fiction';

The obvious benefit here is flexibility, but where it really shines is being able to leverage various index types. In particular, a GIN or GiST index will index every key and value within the hstore. This way when you filter on something it’ll use the index if it makes sense to the planner within Postgres.
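A minimal sketch of such an index (the index name here is an assumption):

```sql
-- A GIN index covers every key and value inside the hstore column,
-- so filters like attributes->'category' = '...' can use it.
CREATE INDEX products_attributes_idx ON products USING gin (attributes);
```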

As hstore isn’t a full document equivalent, it’s a stretch to consider using it as such. If you have relational data as well as some data that may not alway


As the PostgreSQL community continues down its path of world domination, I can't help but wonder whether the various PostgreSQL companies are going to survive the changes. Once upon a time there was an undercurrent of understanding that what was good for the community was good for the company, whatever company that may be. However, over the last few years that seems to have changed. There seems to be more prevalence toward: what is good for the company is good for the community, or in other words, "The goal is profit."

That is a flawed discipline to follow in the Open Source world. A truly beneficial, strong and diverse community has to eliminate that thought entirely. The goal is not profit; profit is the reward.

That isn't to say that profit is bad. That would be stupid. It is profit that allows Command Prompt to sponsor my activities with United States PostgreSQL and Software in the Public Interest. It is to say that my contributions to the community as a whole drive Command Prompt's profit. It is symbiotic; a constant ebb and flow of the relationship between community and commerce.


I would invite other PostgreSQL companies to consider this. I would challenge them to upend their profiteering motive and focus on community building with profit being the reward. The profit will follow. How do you do this? How do you continue to derive profit from community without sacrificing the community or your standing within the community? Here are some practical ideas:

  • If you aren't going to help, offer to find someone that will help. The answer, "Why would I help you" is never appropriate.
  • If you release something to the community, support it as the community would. Otherwise, keep it to yourself (open source or not).
  • Avoid language such as, "It's free, funding for new features or requirements is welcome." It is embarrassing and arrogant. It doesn't provide a solution and has no place on a community support list. If that is your approach contact the person directly. The community isn't your advertising platform. Instead tr
Posted by US PostgreSQL Association on 2016-07-12 at 21:02

Did you know that there is not a single major contributor to PostgreSQL that is not paid to be a contributor to the project?

There is nothing wrong with that. I would argue that it is a sign of a healthy project. However, it does come with downsides. The most important one being that no matter how autonomous your employer says that your work is, at any point your priorities are driven by what is best for the company. That means, it is driven by profit. If the work being done is not going to lend itself to the bottom line of the sponsoring company, those priorities may change.


Posted by Tomas Vondra in 2ndQuadrant on 2016-07-12 at 20:25

I had the pleasure of attending PGDay UK last week, a very nice event; hopefully I’ll have the chance to come back next year. There were plenty of interesting talks, but the one that caught my attention in particular was Performance for queries with grouping by.

I have given a fair number of similar performance-oriented talks in the past, so I know how difficult it is to present benchmark results in a comprehensible and interesting way, and Alexey did a pretty good job, I think. So if you deal with data aggregation (i.e. BI, analytics, or similar workloads), I recommend going through the slides, and if you get a chance to attend the talk at some other conference, I highly recommend doing so.

There’s one point where I disagree with the talk, though. In a number of places the talk suggested that you should generally prefer HashAggregate, because sorts are slow.


I consider this a bit misleading, because an alternative to HashAggregate is GroupAggregate, not Sort. So the recommendation assumes that each GroupAggregate has a nested Sort, but that’s not quite true. GroupAggregate requires sorted input, and an explicit Sort is not the only way to do that – we also have IndexScan and IndexOnlyScan nodes, that eliminate the sort costs and keep the other benefits associated with sorted paths (especially IndexOnlyScan).

Let me demonstrate how (IndexOnlyScan+GroupAggregate) performs compared to both HashAggregate and (Sort+GroupAggregate) – the script I’ve used for the measurements is here. It builds four simple tables, each with 100M rows and different number of groups in the “branch_id” column (determining the size of the hash table). The smallest one has 10k groups

-- table with 10k groups
create table t_10000   (branch_id bigint, amount numeric);
insert into t_10000 select mod(i, 10000), random()
 from generate_series(1,100000000) s(i);

and three additional tables have 100k, 1M and 5M groups. Let’s run this simple query aggregating the data:

SELECT branch_id, SUM(amount) FROM t_10000 GROUP BY 1;

and then convince t
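Although the excerpt is cut short, the sorted input path it builds toward can be sketched as follows (the index name and the VACUUM step are assumptions, not the post's script):

```sql
-- A covering index on (branch_id, amount) lets the planner feed
-- GroupAggregate through an IndexOnlyScan instead of an explicit Sort.
CREATE INDEX idx_t_10000_branch_amount ON t_10000 (branch_id, amount);
VACUUM ANALYZE t_10000;  -- keep the visibility map fresh for index-only scans
```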

Today, on IRC, someone asked an interesting question. Basically she ran a query like: SELECT a, b, c, d, e, f FROM TABLE ORDER BY a. Then she processed the result to get, for each a, an array of the unique values of b, c, d, e, and f, and then she inserted it back into the database, to […]

A common example when talking about why it's a good thing to be able to do PITR (Point In Time Recovery) is the scenario where somebody or some thing (operator or buggy application) dropped a table, and we want to do a recover to right before the table was dropped, to keep as much valid data as possible.

PostgreSQL comes with nice functionality for deciding exactly what point to perform a recovery to, which can be specified at millisecond granularity, or that of an individual transaction. But what if we don't know exactly when the table was dropped? (Exactly meaning at the level of a specific transaction, or at least a millisecond.)

One way to handle that is to "step forward" through the log one transaction at a time until the table is gone. This is obviously very time-consuming.

Assuming that DROP TABLE is not something we do very frequently in our system, we can also use the pg_xlogdump tool to help us find the exact spot to perform the recovery to, in much less time. Unfortunately, the dropping of temporary tables (implicit or explicit) is included in this, so if your application uses a lot of temporary tables this approach will not work out of the box. But for applications without them, it can save a lot of time.

Let's show an example. This assumes you have already set up the system for log archiving, you have a base backup that you have restored, and you have a log archive.

The first thing we do is try to determine the point where a DROP TABLE happened. We can do this by scanning for entries where rows have been deleted from the pg_class table, as this will always happen as part of the drop.


For a long time, one of the best-known shortcomings of PostgreSQL was its inability to parallelise queries. With the release of version 9.6, this will no longer be an issue. A great job has been done on this subject, starting from commit 80558c1, the introduction of the parallel sequential scan, which we will see in the course of this article.

First, you must take note: the development of this feature has been continuous, and some parameters have changed names from one commit to another. This article was written using a checkout taken on June 17, and some features illustrated here will be present only in version 9.6 beta2.

Compared to the 9.5 release, new parameters have been introduced inside the configuration file. These are:

  • max_parallel_workers_per_gather: the number of workers that can assist a sequential scan of a table;
  • min_parallel_relation_size: the minimum size that a relation must have for the planner to consider the use of additional workers;
  • parallel_setup_cost: the planner parameter that estimates the cost of instantiating a worker;
  • parallel_tuple_cost: the planner parameter that estimates the cost of transferring a tuple from one worker to another;
  • force_parallel_mode: a parameter useful for testing; it forces parallelism even for queries where the planner would otherwise operate in other ways.

Let’s see how the additional workers can be used to speed up our queries. We create a test table with an INT field and one hundred million records:

postgres=# CREATE TABLE test (i int);
postgres=# INSERT INTO test SELECT generate_series(1,100000000);
INSERT 0 100000000
postgres=# ANALYSE test;

PostgreSQL has max_parallel_workers_per_gather set to 2 by default, so two workers will be activated during a parallel sequential scan.

A simple sequential scan does not present any novelties:

                                                       QUERY PLAN                         
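The captured plan is cut off above; a hedged sketch of nudging the planner toward more workers might look like this:

```sql
-- Session-level experiment: allow up to 4 workers per Gather node,
-- then inspect the resulting plan.
SET max_parallel_workers_per_gather = 4;
EXPLAIN SELECT count(*) FROM test;
-- on a table this size, the plan should show a Gather node with planned workers
```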

PostgreSQL is an awesome project and it evolves at an amazing rate. We’ll focus on evolution of fault tolerance capabilities in PostgreSQL throughout its versions with a series of blog posts. This is the second post of the series and we’ll talk about replication and its importance on fault tolerance and dependability of PostgreSQL.

If you would like to witness the evolution progress from the beginning, please check the first blog post of the series: Evolution of Fault Tolerance in PostgreSQL


PostgreSQL Replication

Database replication is the term we use to describe the technology used to maintain a copy of a set of data on a remote system.  Keeping a reliable copy of a running system is one of the biggest concerns of redundancy and we all like maintainable, easy-to-use and stable copies of our data.

Let’s look at the basic architecture. Typically, individual database servers are referred to as nodes. The whole group of database servers involved in replication is known as a cluster. A database server that allows a user to make changes is known as a master or primary, or may be described as a source of changes. A database server that only allows read-only access is known as a Hot Standby or, sometimes, a slave server. (The Hot Standby term is explained in detail under the Standby Modes title.)

The key aspect of replication is that data changes are captured on a master, and then transferred to other nodes. In some cases, a node may send data changes to other nodes, which is a process known as cascading or relay. Thus, the master is a sending node but not all sending nodes need to be masters. Replication is often categorized by whether more than one master node is allowed, in which case it will be known as multimaster replication.

Let’s see how PostgreSQL has handled replication over time and what the state of the art is for fault tolerance in terms of replication.

PostgreSQL Replication History

Historically (around the years 2000-2005), Postgres concentrated only on single-node fault tolerance/recovery, which is mostly achieve


pg_blocking_pids is one of those things that makes analysis tasks easier in Postgres. It was introduced in 9.6 with the following commit:

commit: 52f5d578d6c29bf254e93c69043b817d4047ca67
author: Tom Lane <>
date: Mon, 22 Feb 2016 14:31:43 -0500
Create a function to reliably identify which sessions block which others.

This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether.  (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)


You can refer to the commit text in full for more details on why this function is better than a self-join on the system catalog pg_locks (one portion being the waiter, the other the holder, doing field-by-field comparisons). Here is a short summary:

  • Better understanding of which lock mode blocks the other.
  • When multiple sessions are queuing to wait for a lock, only the one at the head is reported.
  • With parallel queries, all the PIDs of the parallel sessions are reported. Note that it is possible in this case that duplicated PIDs are reported here because for example multiple waiters are blocked by the same PID.

Note that the primary reason for its introduction is to simplify the isolation testing facility, which had been querying pg_locks directly to get information on the lock status between lock holders and waiters.

This function takes as input the PID of a session and returns the set of PIDs of the sessions holding a lock that the given session is waiting for. So let’s take an example; here is session 1:

=# CREATE TABLE tab_locked (a int);
=# SELECT pg_backend_pid()
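The example is truncated above; a hedged sketch of the full scenario it sets up might run as follows (the PID value 4242 is an illustrative assumption):

```sql
-- Session 1: take a strong lock and keep the transaction open.
BEGIN;
LOCK TABLE tab_locked IN ACCESS EXCLUSIVE MODE;

-- Session 2 (suppose its pg_backend_pid() reported 4242): this now blocks.
INSERT INTO tab_locked VALUES (1);

-- Session 3: ask which PIDs are blocking session 2;
-- this should return session 1's PID.
SELECT pg_blocking_pids(4242);
```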
Posted by Simon Riggs in 2ndQuadrant on 2016-07-10 at 20:28

A salesman from Oracle recently spoke to my sales colleague at a conference, teasing him that he should “come make some money”. That was quite surprising, given Oracle’s well documented problems in gaining new sales, most especially what they say in their own public filings. The reply was unsurprisingly: “No, thanks – I’d like a permanent job.”

Oracle’s rising open source problem

Oracle Sales Erode as Startups Embrace Souped-Up Free Software

Of course, it’s well known that squeezing existing customers is one of the things they’re doing to avoid a catastrophic slump in sales. The only thing I can add from my own experience is the level of pure anger that the current policies are generating among their current customers. More than that, avoiding database vendor lock-in for their next purchase is the number one requirement, so the 2ndQuadrant story around PostgreSQL sits well with soon-to-be ex-Oracle customers.