Posted by Vasilis Ventirozos in OmniTI on 2015-11-24 at 15:59:00
It's been almost a year since 9.4 was released. It included many nice features, but the one that changed my day-to-day administration was replication slots.

At OmniTI, we use something called OmniPITR a lot. OmniPITR could be called a WAL management suite: it can ship WALs, replay WALs, clean up WALs when they are no longer needed, and it can be used for backups. The way I usually use it is to first set up streaming replication to a replica, and then set up WAL-shipping replication on top of the streaming replication. The reason is simple: I want to have all the WALs around in case I lose the slave, and I never really liked the wal_keep_segments approach.

Backups are usually taken from a slave in order to reduce load on the master.

Briefly, OmniPITR does this by sending pg_start/stop_backup to the master, placing a pause-removal file on the slave so WALs are kept, and creating two tar.gz files: one for the base backup and one for the WALs.

Usually I have a backup server keeping all the WAL files and the backups, which means a second (direct or indirect) WAL-shipping destination.
All this is nice and works really well, but with replication slots it could be simpler and more efficient.

Recently, I had to set up a backup for a 1-master, 2-slave setup on 9.4.

The requirements of the backup would be the following:

  • backup will be taken from the slave.
  • backup will be incremental allowing PITR
  • backup will be stored remotely on another machine.
  • minimum possible wal shipping targets.
  • replicas should be identical and adding a new replica should be trivial.
  • backup should be initiated from the server that keeps backups and not from a slave.
  • backups should be tested by restoration every week.

Pretty straightforward stuff, really.

I set up archiving to the backup server with a simple rsync-over-ssh command:
archive_command = 'rsync -a %p postgres@<Backup server IP>:/data/walarchive/%f'

I created a backup user that will use .pgpass
touch ~/.pgpass ; chmod 0600 ~/.pgpass

and added:

<master ip>:5432:template1:backup:pgbackup
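
Since replication slots are the point of this post, here is a minimal sketch of how a physical slot might be wired into this kind of setup (the slot name is illustrative):

```sql
-- On the master: create a physical replication slot, so WAL is retained
-- until every consumer has received it (no wal_keep_segments guesswork).
SELECT pg_create_physical_replication_slot('backup_slot');

-- On the replica, point recovery.conf at the slot:
--   primary_slot_name = 'backup_slot'

-- Monitor retained WAL, so an abandoned slot can't fill the disk:
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
```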

[continue reading]

Posted by Federico Campoli on 2015-11-24 at 11:11:00
Three days to go for the next Brighton PostgreSQL meetup.
I'll run a live hangout of the talk.

You can join the event there.

The recording will become available on YouTube shortly after the talk ends.
The PostgreSQL RDBMS has very informative built-in tools for tracking the state of the database. One such tool is pg_stat_activity, a system view that allows you to monitor database processes in real time. This view is comparable to the system command top, and it is one of the first places a database administrator can start an investigation when a problem occurs. In this post I explain some useful examples of how this view can be used to detect abnormal activity. To start, we need just a PostgreSQL client, psql for example.

As a first step you can use an easy query:
# SELECT * FROM pg_stat_activity;

It is not important which database you are connected to: pg_stat_activity is shared among databases.
pg_stat_activity shows a single line for each database connection, and there is a dedicated UNIX process for each connection. Each connection has several attributes:
  • pid - ID of the process which handles the connection. In PostgreSQL terminology this process is called a backend or worker.
  • datid, datname - ID and name of the database the client is connected to.
  • usesysid, usename - ID and name of the user which was used to connect.
  • client_addr, client_port, client_hostname - network settings of the client: network address, port and hostname.
  • application_name - an arbitrary name that can be specified when the client connects, or later through a session variable.
  • backend_start, xact_start, query_start, state_change - timestamps which indicate when the process (backend), transaction, or query within a transaction was started, and the last time the state of the process changed.
  • state, waiting - state of the process, and a flag which indicates whether the process is waiting for another process.
  • backend_xid, backend_xmin - the running transaction's ID (xid), and the transaction ID (xmin) which determines the visibility scope for the currently running transaction.
  • query - text of the query currently being run by the worker, or the query most recently executed in this connection.
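
As a small illustration (column names as in 9.4), the attributes above can be aggregated, for example to spot a pile-up of idle-in-transaction sessions:

```sql
-- Count connections per state; a large "idle in transaction" bucket
-- often points at application code holding transactions open.
SELECT state, count(*)
  FROM pg_stat_activity
 GROUP BY state
 ORDER BY count(*) DESC;
```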

Since each line describes a single connection, we can easily find out how many client connecti

[continue reading]

We just pushed out installers for PostGIS 2.2.0 for PostgreSQL 9.5beta2 on Windows, both 32-bit and 64-bit, on Application Stackbuilder. These installers are also available as standalone downloads listed on the PostGIS windows page. This is the first PostGIS 2.2.0 release for PostgreSQL 9.5 32-bit, and a re-release for PostgreSQL 9.5 64-bit (this time compiled against beta2 instead of beta1).

In quick testing, the PostGIS 2.2 beta1 release and pgRouting 2.1.0 worked fine on 9.5beta2; however, you may want to reinstall anyway just to be safe. You can just reinstall over your existing install; there is no need to uninstall first. Similarly, upgrading PostgreSQL 9.5beta1 to 9.5beta2 seemed not to require pg_upgrade or a dump/restore, so it is safe to just upgrade from 9.5beta1 to 9.5beta2. Other notes about this 9.5beta2 PostGIS 2.2.0 release:

  • The FDW API changed between PostgreSQL 9.5beta1 and PostgreSQL 9.5beta2, so if you don't reinstall the bundle, OGR_FDW (using the PostGIS 2.2 beta1 executables) will crash and burn on PostgreSQL 9.5beta2. Similarly, the newly compiled OGR_FDW will not work on PostgreSQL 9.5beta1, so upgrade to 9.5beta2 first.
  • The PostgreSQL 9.5 builds (both beta1 and beta2) are compiled against the pointcloud 1.1 master branch. This was required because the released pointcloud 1.0.1 does not compile against PostgreSQL 9.5.
  • The PostgreSQL 9.5beta2 PostGIS 2.2.0 release comes packaged with SFCGAL 1.2.2 (instead of 1.2.0 like the other versions), which fixes a crasher with ST_StraightSkeleton as noted in ticket - A newer SFCGAL will be packaged with the upcoming PostGIS 2.2.1, but if you are on an older edition and are using SFCGAL, you can always copy the latest SFCGAL.dll binaries from the 2.2.1dev packages on the PostGIS windows page.
I have just released version 4.16 of the PostgreSQL Buildfarm client

It can be downloaded from

Several bugs have been fixed, and there are these non-trivial changes:

  • capturing of TAP test logs
  • bin check enabled on Windows
  • rm_worktrees feature
  • an experimental module to run Sepgsql tests
  • try database shutdown following a start failure
  • report all PGBuild module versions to server

rm_worktrees is enabled by putting this in your config file:

rm_worktrees => 1,

The effect is that at the end of a run the checked-out work tree is removed, leaving just the git repository, which in turn might be mostly linked to your HEAD branch if you use the git_use_workdirs option. Before a run, the work tree will be checked out again. The net effect is a significant saving in space used. With these two options, the additional space for each branch, except when it's actually building, is reduced to less than 30MB. On crake, the git mirror, branch trees and cache now come in at about 1.5GB. That's a lot less than it used to be. The additional cost of checking out the worktree each time is very modest.

Shutdown after a start failure tries to remedy a situation where we try to start the server and don't detect that it has started, even though it actually has. So now if we get a failure we try to shut down any server that might have started, just in case. This change is possibly redundant given the recent change where postgres detects that its data directory has disappeared and shuts down when it has, but it's probably worth having anyway.

Posted by Shaun M. Thomas on 2015-11-20 at 16:45:06

It has occurred to me that I may have been spending a bit too much time being excited about new Postgres features and developments in the community. One of the intents of this weekly article was for educational purposes, so this week, let’s get back to basics. To that end, the topic for this week boils down to the tools available for managing Postgres instances, and how to use them. Surprisingly, it’s not as straightforward as you might think.

Having used Postgres for almost 15 years now, it’s easy to forget (or mentally block!) the early struggles. With new tools and wrappers always being developed, it’s not always clear what the best practices for managing a Postgres cluster actually are. Indeed, it often depends on how Postgres is installed.

Let’s start with a basic source-code installation. I won’t cover that process, but we can assume that after the smoke clears, the binaries are somewhere in /usr/bin or /usr/local/bin, and are thus in our execution path. Given that, let’s say we have an available mount point at /data and want to create a new cluster there. Here’s how that might go:

sudo mkdir /data/pgsql
sudo chown postgres:postgres /data/pgsql
sudo su - postgres
initdb -D /data/pgsql/my_db
pg_ctl -D /data/pgsql/my_db start

We now have an active Postgres instance at /data/pgsql/my_db. It’s extremely common to use the postgres OS user for this kind of thing, hence all of our sudo commands to prepare. It’s entirely possible to do this as a regular user, but I usually don’t recommend that approach.

In any case, this type of installation essentially depends on the pg_ctl command-line tool. It does everything related to controlling a Postgres instance. But it’s also annoying to use the -D parameter all the time when using Postgres tools, so there are several environment variables that can also do the job. This lets us prime our environment with .bashrc, for example. Let’s stop the instance:

export PGDATA=/data/pgsql/my_db
pg_ctl stop -m fast

Why the -m fast part? By default, Postgres is ex

[continue reading]

The Paxos algorithm is a powerful building block for building highly available distributed systems. Paxos can be seen as a function paxos(k,v) that returns the same value on all servers in a group for a certain key (k), and the value is one of the inputs (v). Paxos is most commonly used to implement log replication through a technique called Multi-Paxos. In Multi-Paxos, nodes call paxos using the indexes in a log as keys and state changes (e.g. 'set primary to node2') as values. Using this technique, data can be kept consistent and available across multiple servers, even if some of them fail.

We recently started exploring Paxos as a technique for doing automated fail-over of PostgreSQL servers and wrote a simple implementation of Paxos and Multi-Paxos including basic support for membership changes in about 1000 lines of PL/pgSQL. We found PL/pgSQL to be an excellent tool for implementing Paxos, because the necessary transactional semantics are a natural part of the language. Other consensus algorithms would have been harder to implement in this way, because they rely on timers and other background tasks. 

The pg_paxos extension demonstrates how the Paxos functions can be used for fault-tolerant, consistent table replication in PostgreSQL. While a replicated table has high write and read latencies due to the network round-trips involved, it can be very useful for applications such as automated fail-over. We plan to add optimisations such as the ones applied in Google Megastore to reduce the overhead and develop the extension further to be able to replace components like etcd or Zookeeper.

Posted by Tomas Vondra in 2ndQuadrant on 2015-11-19 at 14:30:00

While discussing the results of various filesystem benchmarks, both here and in conference talks, one of the things I've been questioning is the negligible impact of the TRIM/DISCARD mount option.

I was speculating that maybe TRIM only really matters when the drive gets almost full, because until then the controller has enough room to write the incoming data without putting the internal garbage collection under pressure. But I recently did a few pgbench runs aiming to test exactly this, using a rather large scale (filling more than 90GB of the 100GB drive), yet still saw no difference.

Another speculation was that maybe this particular SSD is really good and has so much additional overprovisioning that TRIM still makes no difference. After all, it's a good SSD from Intel (S3700) meant for write-intensive data center applications, so I wasn't entirely surprised by that.

But then it dawned upon me ...


The TRIM does not matter because pgbench does not discard any pages, it simply overwrites the pages in place, so there's nothing to TRIM.

What pgbench does is (a) write new data by adding new pages and (b) update those pages by rewriting them in place. None of that involves discarding pages: the controller knows which page of the file it's overwriting, and therefore knows which SSD pages are dirty and need to be erased. The same goes for WAL segments, which are reclaimed and overwritten in place.

Maybe the TRIM does matter for other workloads (e.g. creating a lot of temporary files on the SSD device), or in databases where tables are often dropped or truncated. But for the simple pgbench workloads, TRIM does not matter at all.

Posted by Pavel Golub in MicroOLAP on 2015-11-18 at 09:48:28

I’m using this perfect tool called Sublime Text 3 for a bunch of tasks. One of them is viewing SQL scripts from time to time. ST3 has perfect SQL highlighting, but what I miss most is context-help functionality, e.g. I select “LEFT JOIN” and hit the F1 hot key. But that’s not a problem, since ST3 has a lot of packages. To solve my problem I need the GotoDocumentation package and some tuning. Here are my settings for GotoDocumentation:

"docs": {
// obj containing the docs for each scope
// these are merged with the default ones
// the key value pair represent scope -> doc url
// supported placeholders:
// - %(query)s the selected text/word
// - %(scope)s the current scope
"sql": ""

"pascal": "",
// if we have no docs for the current scope
// we will try using the fallback one,
// to disable set to false
"fallback_scope": "google"

ST3 context help for pg sql


Posted by Marko Tiikkaja on 2015-11-18 at 02:30:00
Recently I used perf to look into what could be the cause of our increased CPU usage on the PostgreSQL server (encouraged by Andres' great talk). I was somewhat surprised to find that thirty percent of the CPU time used by postgres was spent spinning on spinlocks, i.e. doing no actual useful work.  Digging into the profile a bit more, most of these were coming from a function called
On 11th of November, Robert Haas committed patch: Generate parallel sequential scan plans in simple cases.   Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying […]
On 30th of October, Tom Lane committed patch: Implement lookbehind constraints in our regular-expression engine.   A lookbehind constraint is like a lookahead constraint in that it consumes no text; but it checks for existence (or nonexistence) of a match *ending* at the current point in the string, rather than one *starting* at the current […]
Posted by Tomas Vondra in 2ndQuadrant on 2015-11-16 at 19:00:00

As you may be aware, PostgreSQL splits data files into 8kB pages by default. It's possible to use different page sizes (up to 32kB), but that requires compiling custom packages, so almost no one does it. It's quite rare to come across an installation using a different page size at a customer site, for example.

Why 8kB data pages and not a different size? Clearly, smaller or larger pages might be more efficient in some cases, and indeed some other databases use different page sizes. For example InnoDB uses 16KB pages by default, and some databases even allow a mix of page sizes. So there seems to be a gain in using different page sizes, otherwise engineers working on those products would not waste time implementing it.

So what factors determine the optimal page size? And how do SSD disks change the results?

One of the fundamental rules about page size, called the Five Minute Rule, was formulated by Jim Gray and G. F. Putzolu in 1985, and approaches the problem as an economic question: "Given disk and RAM prices, how long should a page of a given size be kept in memory?" The equation expressing the rule is essentially this

break_even_interval = (pages_per_MB_of_RAM / IOPS_per_disk) * (price_per_disk / price_per_MB_of_RAM)

and for values common in 1985 (1kB pages and disk/RAM prices in 1985), the break-even interval came out as roughly 5 minutes (hence the name of the rule).

However, a lot has changed since 1985: the prices of RAM and disks have plummeted, the performance of rotational storage has slightly improved, and so on. Despite that, the 1997 revision of the paper came to about the same conclusion: the break-even interval is still about 5 minutes, though now assuming an 8kB page size (not 1kB as used in the 1985 paper).

This is probably also the reason why PostgreSQL uses 8kB pages: Postgres95 (as it was called at the time) was released shortly before the paper was published, but it presumably followed the same reasoning.

But again, a lot has changed since 1997 - the RAM/disk prices continued to plummet and, probably more importantly, flash storage is

[continue reading]

Posted by Dimitri Fontaine in 2ndQuadrant on 2015-11-16 at 13:18:00

I had the pleasure to be invited to speak at All Your Base Conference 2015 about PostgreSQL (of course). The conference gathers together lots of user experience around data management and database products, either in the now classic meaning of the word (I mean relational database management systems here) or the newer set of trade-offs represented by the NoSQL set of tools.

The conference was very nicely organized and ran smoothly, and I got the unique chance to hear a series of back-from-the-trenches stories about people's tools of choice and the reasons they do things their way. It was very enlightening!

If you ever get the chance to attend All Your Base, take my word for it and just go!

Posted by Vladimir Borodin on 2015-11-16 at 13:00:00

People with experience of commercial RDBMSs are used to having the ability to answer the question "What is a particular session doing right now?", or even "What was that session waiting on 5 minutes ago?" For a long time PostgreSQL did not have such diagnostic tools, and DBAs got by with workarounds of varying sophistication. I gave a talk (in Russian) about how we do it. This talk was given together with Ildus Kurbangaliev from PostgresPro, and Ildus spoke about a tool that allows answering the questions above.

Strictly speaking, it is not the first attempt to implement what people usually call a wait [events] interface, but all previous attempts were never brought to a reasonable state and died as proof-of-concept patches. pg_stat_wait, however, is currently available as a set of patches against the current stable 9.4 branch and the 9.6 branch under development (the actual versions should be looked for in pgsql-hackers@).

After quite long testing and fixing bugs we even deployed them to production.


Before it all becomes part of core PostgreSQL, you need to recompile postgres. I think describing the rebuild as ./configure && make && sudo make install is not very helpful here; it is much better to look into the documentation.

After it you should add pg_stat_wait to shared_preload_libraries. Additionally, you can add following options to postgresql.conf:

  • waits_monitoring = on - enabling functionality on,
  • pg_stat_wait.history = on - storing history of wait events,
  • pg_stat_wait.history_size = 1000000 - number of last events to keep in history,
  • pg_stat_wait.history_period = 1000 - how often should wait events be stored in history (ms).

After that you should restart PostgreSQL and run CREATE EXTENSION pg_stat_wait, and everything will start working.


What exactly will start to work? First you may look at what is inside the extension:

rpopdb01g/postgres M # \dxS+ pg_stat_wait
           Objects in extension "pg_stat_wait"
                   Object Description

[continue reading]

Andreas 'ads' Scherbaum

FOSDEM PGDay is a one-day conference that will be held ahead of FOSDEM in Brussels, Belgium, on Jan 29th, 2016. This will be a one-day focused PostgreSQL event, with a single track of talks. Registration is required to attend and will open soon. Since we have a limited number of seats available for this event, we urge everybody to register as soon as possible once registration opens.

PostgreSQL Europe will also have our regular devroom at FOSDEM on Sunday the 31st, which will be held at the main FOSDEM venue at ULB. This day will, of course, continue to be free of charge and open to all FOSDEM entrants. No registration is required to attend this day.

For full details about the conference, venue and hotel, see

The call for papers is now open for both these events. We are looking for talks to fill both these days with content for both insiders and new users. Please see for details and submission information.

The deadline for submissions is December 7th, 2015, but we may as usual pre-approve some talks, so get your submissions in soon!

We will provide a special rate with the Brussels Marriott Hotel. For details, see

One small PostgreSQL 9.5 feature I worked on is the new hinting mechanism, which sometimes hints, based on a score, at what you might have meant to type after misspelling a column name in an SQL query. The score heavily weighs Levenshtein distance. A HINT message is sent to the client, which psql and other client tools display by default.

It's common to not quite recall offhand whether a column name is pluralized, or where underscores are used to break up the words that make up the name of a column. This feature is targeted at that problem, providing guidance that allows the user to quickly adjust their query without mentally context switching. For example:

postgres=# select * from orders where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: select * from orders where order_id = 5;
HINT:  Perhaps you meant to reference the column "orders"."orderid".

You may also see a hint in the case of two possible matches, provided both matches have the same score, and the score crosses a certain threshold of assumed usefulness:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where order_id =...
HINT:  Perhaps you meant to reference the column "o"."orderid" or the column "ol"."orderid".

If the alias is used in the erroneous column reference (which this query must have anyway), the hint becomes more specific:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where o.order_id = 5;
ERROR:  42703: column o.order_id does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where o.order_id...
HINT:  Perhaps you meant to reference the column "o"."orderid".

This feature should make writing queries interactively in psql a bit more pleasant. Mental context switching to figur

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2015-11-14 at 15:20:00
Fresh from yesterday's help file:

Say you want to insert a record into your table with all the default values set. Later on you'll update some of the values. You might want to do this so you can get the row's ID field. So you try this:
insert into mytable values () returning my_id;
but you find it doesn't work. VALUES isn't allowed to have an empty list. Here's what you do:
insert into mytable values (default) returning my_id;
This will insert the default value explicitly into the first column in the table, and implicitly into all the other columns. It doesn't matter if some of the columns don't have an explicit default value set; in that case the default is NULL. Of course, that means this won't work if you have NOT NULL columns without a default set.
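
As an aside (not from the original tip), the SQL-standard DEFAULT VALUES clause expresses the same intent without naming any column at all; a sketch, assuming a table like this:

```sql
CREATE TABLE mytable (
    my_id  serial PRIMARY KEY,
    note   text
);

-- Insert a row consisting entirely of defaults, and get its ID back:
INSERT INTO mytable DEFAULT VALUES RETURNING my_id;
```

It has the same NOT NULL caveat as the VALUES (default) trick.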
Posted by Abdul Yadi on 2015-11-14 at 05:26:49

I have gone through an interesting, small C code package for QR encoding. Its logic is contained in two small files: QR_Encode.h (3.7KB) and QR_Encode.c (61KB). Then, on a sun-shiny Saturday morning, I managed to add in-memory monochrome bitmap construction (1 bit per pixel) and wrap the whole package as a PostgreSQL extension module. I share it at qrcode.tar.bz2.

Please modify PG_CONFIG = /opt/pgsql/9.4/bin/pg_config in Makefile as necessary. Then make and make install.

Connect to a database then invoke sql command CREATE EXTENSION qr;

SELECT qr('QR Code with PostgreSQL', 0, 0, 4);

The arguments are:

  1. Text to be encoded.
  2. Error correction level (0 to 3).
  3. Accepted model number (0 to 2).
  4. Scale 4 means: a dot in QR image will be 4 pixel width and 4 pixel height.

It returns a byte array representing a monochrome bitmap. You can save it as a file or render it directly in an HTML page.
QRCode Demo

Posted by Bruce Momjian in EnterpriseDB on 2015-11-13 at 19:30:01

My daughter Catherine, then 12 years old, was interviewed at Postgres Open 2013 by Craig Kerstiens and Selena Deckelmann. Community members who have heard the 23-minute recording have found it humorous, and I have recently received permission to release it to a wider audience.

Posted by Shaun M. Thomas on 2015-11-13 at 17:49:43

A couple days ago, Robert Haas announced that he checked in the first iteration of parallel sequence scans in the Postgres 9.6 branch. And no, that’s not a typo. One of the great things about the Postgres devs is that they have a very regimented system of feature freezes to help ensure timely releases. Thus even though 9.5 just released its second beta, they’re already working on 9.6.

So what is a sequence scan, and why does this matter? Past articles have covered this, but Postgres uses sequence scans when it needs to read the entire contents of a table. For larger entities consisting of tens or hundreds of millions of rows, this is a time-consuming process. Even the best processor can only handle a few million rows per second, so as scale increases vertically, there’s currently no way to address these larger data volumes efficiently without significant effort.

As slow as a sequence scan is, it’s also a relatively simple process, and a great starting point for the long road to a parallel Postgres engine. Postgres knows how big the table files are and obviously knows its own storage format, so it merely needs to set scan ranges and farm them out to workers that report the results back to a coordinator process. In theory, that’s a clean transition that avoids complicated aggregation rules or clause pushdown headaches.

But how well does it work? To see this for myself, I did something I’ve actually never done before now: download the Postgres git tree. Then I set up a test case:

CREATE TABLE test_tab AS
SELECT, repeat(' ', 20) AS junk
  FROM generate_series(1, 20000000) a(id);

ANALYZE test_tab;

With this test table, there are 20-million rows of empty junk and no indexes, so we’re effectively forcing Postgres to use a sequence scan for any query. Then we have to enable the feature with the max_parallel_degree parameter. And finally we invoke a query with a naive WHERE clause applied to every row so the engine has to actually do some work.

SET max_parallel_degree TO 1;
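
The excerpt is cut off here, but a query of the kind described (a naive WHERE clause touching every row) might look like this; the exact query is my assumption, not taken from the original:

```sql
EXPLAIN ANALYZE
SELECT count(*)
  FROM test_tab
 WHERE junk IS NOT NULL;  -- no index exists, so every row must be scanned
```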

[continue reading]

We are pleased to announce the final agenda of keynote addresses for the first annual PGConf Silicon Valley conference. Keynotes will be delivered by Jay Kreps, CEO of Confluent, Ivan Novick, Greenplum Database Product Manager at Pivotal Software, and Umur Cubukcu, CEO and Co-founder of Citus Data. The conference is next week, November 17-18, 2015, at the South San Francisco Conference Center. Organized by Citus Data in cooperation with the San Francisco PostgreSQL Users Group, PGConf Silicon Valley is a technical conference aimed at the Silicon Valley PostgreSQL community and beyond.

This year's conference theme is "High Performance & Scalability". Breakout session tracks include DevOps, New Features, PostGIS, Tales from the Trenches, Hacking Postgres, and Data at Scale.

The Keynotes

On Wednesday, November 18, we will hear three keynote addresses that will launch a full day of breakout sessions:

  • Apache Kafka: A Commit Log for the Datacenter
    Jay Kreps, CEO of Confluent, will discuss what database logs, event data, stream processing and Franz Kafka have to do with each other.
  • "PostgreSQL License, Vendor Eco-systems, Cloud Providers, and Future of RDBMS
    Ivan Novick, Greenplum Database Product Manager at Pivotal Software, will explore how the permissive PostgreSQL license has created an ecosystem of vendors providing spin-off products based on the PostgreSQL database. He will cover the evolution of PostgreSQL innovations, ranging from databases for big data analytics to OLTP, as well as running them as shrink-wrapped software or in the cloud. He will also provide a vision for how these initiatives are converging to shape the future of relational databases.
  • Scaling Out PostgreSQL in the Distributed Era
    Umur Cubukcu, CEO and Co-founder of Citus Data.

Community Happy Hour

On Tuesday, November 17, following the day’s tutorial sessions, there will be a no-host community happy hour at The Brass Elephant Lounge and Sports Bar. Promoted in conjunction with the San Francisco PostgreSQL Users Group (SFPUG), the lounge

[continue reading]

A client recently came to me with an ongoing mystery: a remote Postgres replica needed to be replaced, but pg_basebackup repeatedly failed. It would stop partway through every time, reporting something along the lines of:

pg_basebackup: could not read COPY data: SSL error: decryption failed or bad record mac

The first hunch we had was to turn off SSL renegotiation, as it isn't supported in some OpenSSL versions. By default Postgres renegotiates keys after 512MB of traffic; setting ssl_renegotiation_limit to 0 in postgresql.conf disables it. That helped pg_basebackup get much further along, but the process still bailed out before completion.

The client's Chef has a strange habit of removing my ssh key from the database master, so while that was being fixed I connected in and took a look at the replica. Two pg_basebackup runs later, a pattern started to emerge:
$ du -s 9.2/data.test*
67097452        9.2/data.test
67097428        9.2/data.test2
Besides being nearly identical in size, those numbers are also suspiciously close to 64GB. I like round numbers; when a problem happens close to one, that's often a pretty good tell of some boundary or limit. On a hunch that it wasn't a coincidence, I checked around for any similar references and found a recent openssl package bug report:

RHEL 6, check. SSL connection, check. Failure at 64 GiB, check. And lastly, a connection with psql confirmed AES-GCM:
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)

Once the Postgres service could be restarted to load in the updated OpenSSL library, the base backup process completed without issue.

Remember, keep those packages updated!
Posted by Josh Berkus in pgExperts on 2015-11-13 at 04:45:00
First, in case you somehow missed it, PostgreSQL 9.5 Beta 2 is now out.  Time for another round of testing!  There are fewer and fewer bugs being found, so we're not far from a final release.  I don't know about anyone else, but we're going into production on Beta 2 in a couple of places.  Can't wait any longer for Upsert and RLS.

Second, pgConfSV has announced its keynotes, from CitusData, Pivotal, and -- as a bit of a surprise -- from streaming data thing Kafka (before you ask, I didn't pick the keynotes). I believe registration is still open, so you can still go to the conference next week if you act quickly.

Thirdly, I have a roundup of some cool blogs covering PostgreSQL which aren't on Planet and which you may have missed: one did a terrific two-part article on why PostgreSQL is the best open source database.  Read it here: Part I  Part II

They also covered using JSON in Postgres with Python.

In stranger news, there's an amazingly strange story about Soylent, PostgreSQL, and auto-responding to poisoning accusations.  Yes, Soylent Inc. uses PostgreSQL, why not?  Read the whole weird tale here on Zapier's blog.

That's it for my round up ... if you have some good links, post them in the comments.

The Tipping Point

In PostgreSQL 9.1 and earlier, benchmarks that I and others did all showed that the optimal number of active database connections was usually somewhere around ((2 * core_count) + effective_spindle_count).  Above this number, both throughput and latency got worse.  In every version since then the tipping point has moved, but the effect is still present at some point.  In graphs you often see this visually with Transactions Per Second up the y axis and Concurrency (i.e., number of active connections) across the x axis, with a steep climb followed by a "knee" and a performance drop-off.  The good news is that every major release for a while has moved the knee to the right and decreased the slope past the knee -- but the knee is still there.
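That rule of thumb is easy to sketch in code (the hardware numbers below are hypothetical, just to show the arithmetic):

```python
def optimal_connections(core_count: int, effective_spindle_count: int) -> int:
    """Rule-of-thumb pool size from the 9.1-era benchmarks:
    (2 * core_count) + effective_spindle_count."""
    return 2 * core_count + effective_spindle_count

# e.g. a 16-core box with a 4-drive RAID array:
print(optimal_connections(16, 4))  # -> 36
```

Remember this is only where the knee sat in those older benchmarks; on newer releases the knee has moved right, so treat the result as a starting point for your own testing rather than a hard limit.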

Users and Database Connections are Different Things

Sometimes people will say "I want to support 2000 users, with fast response time."  It is pretty much guaranteed that if you try to do that with 2000 actual database connections, performance will be horrible.  If you have a machine with a lot of cores and the active data set is fully cached, you will see much better performance for those 2000 users by funnelling the requests through a small number of database connections -- depending on the PostgreSQL version it may be anywhere from 2 to maybe 10 times the number of cores.

To understand why that is true, this thought experiment should help.  Consider a hypothetical database server machine with only one resource to share -- a single core.  This core will time-slice equally among all concurrent requests with no overhead.  Let's say 100 requests all come in at the same moment, each of which needs one second of CPU time.  The core works on all of them, time-slicing among them until they all finish 100 seconds later.  Now consider what happens if you put a connection pool in front which will accept 100 client connections but make only one request at a time to the database server, putting any requests which arrive while the connection is busy into a queue.  Now when

[continue reading]
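The arithmetic behind that thought experiment can be sketched as a toy simulation (my own illustration, not from the original post): with perfect time-slicing all 100 one-second requests finish together at the 100-second mark, while a single-connection pool finishes them at 1s, 2s, ..., 100s, roughly halving the average completion time:

```python
N, WORK = 100, 1.0  # 100 requests, each needing 1 second of CPU

# Perfect time-slicing: every request finishes only when all the work is done.
timesliced = [N * WORK for _ in range(N)]

# Single-connection pool: requests run one at a time; the i-th finishes at i seconds.
pooled = [(i + 1) * WORK for i in range(N)]

avg = lambda xs: sum(xs) / len(xs)
print(avg(timesliced))  # -> 100.0 (average seconds to completion)
print(avg(pooled))      # -> 50.5  (average seconds to completion)
```

And that's with an idealized core that time-slices with zero overhead; real context switches, lock contention, and cache pressure make the unpooled case even worse.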

Posted by Robert Haas in EnterpriseDB on 2015-11-11 at 15:53:00
I previously suggested that we might be able to get parallel sequential scan committed to PostgreSQL 9.5.  That did not happen.  However, I'm pleased to report that I've just committed the first version of parallel sequential scan to PostgreSQL's master branch, with a view toward having it included in the upcoming PostgreSQL 9.6 release.

Read more »
On November 27th at 19:00 GMT I'll talk at the Brighton PostgreSQL meetup.

This time the group has chosen streaming replication as the topic.

The talk will cover PostgreSQL's write-ahead logging and the crash recovery process. The audience will learn how to set up a standby server using streaming replication and how to troubleshoot it.

Please RSVP here.

Posted by Jeff Frost in pgExperts on 2015-11-11 at 06:31:00
If you're like me, you use wal-e for PITR based backups on many of the PostgreSQL servers you manage.  While setting it up, you likely discovered the --terse option greatly reduces the log output.  However, if you run the backup-push job in cron and have email aliases set up properly, you probably receive emails like this every time the backup-push runs:

NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

Why is that?  Well, the default value for client_min_messages in PostgreSQL is NOTICE, which means the client (wal-e in this case) will be sent messages of NOTICE and higher priority.

So, how to change that?  You may have gotten this far and realized there isn't a handy command line switch that wal-e exposes.  That's ok though, wal-e uses libpq, so we just need to use the PGOPTIONS environment variable.

If you're using wal-e with envdir like in the documentation examples, then you can just plunk down a PGOPTIONS file in /etc/wal-e.d/env that looks like this:

-c client_min_messages=WARNING

and that's it. If you're using a wrapper script for wal-e instead of envdir, you just need to export that environment variable before calling wal-e like this:

export PGOPTIONS="-c client_min_messages=WARNING"

Now sit back and enjoy a few fewer cron emails in the morning.

The following commit has popped up not so long ago in the development branch of Postgres 9.6:

commit: d89494166351e1fdac77d87c6af500401deb2422
author: Tom Lane <>
date: Tue, 3 Nov 2015 18:42:18 -0500
Allow postgres_fdw to ship extension funcs/operators for remote execution.

The user can whitelist specified extension(s) in the foreign server's
options, whereupon we will treat immutable functions and operators of those
extensions as candidates to be sent for remote execution.

Whitelisting an extension in this way basically promises that the extension
exists on the remote server and behaves compatibly with the local instance.
We have no way to prove that formally, so we have to rely on the user to
get it right.  But this seems like something that people can usually get
right in practice.

Up to 9.5, postgres_fdw evaluates shippability, i.e. whether a given operator or function can safely be executed on a remote server instead of locally, using roughly three factors:

  • Is the object involved a built-in object?
  • Does this object use a safe collation? Very roughly, the collation needs to match DEFAULT_COLLATION_OID.
  • Does this object contain mutable functions? Basically anything that is not immutable (look at contain_mutable_functions in src/backend/optimizer/util/clauses.c to get a better idea)

If all conditions are satisfied, the function or operator is considered safe for remote execution, which can provide a huge performance gain, particularly when a query works on many tuples at once. If any of those conditions is not met, the function or operator will be executed locally after all tuples have been fetched from the remote source.

The commit above, which will appear in the postgres_fdw extension in Postgres 9.6, improves this area by relaxing the first condition. In short, an object no longer needs to be built-in to be considered shippable for remote execution: if an immutable operator or function is part of a whitelisted extension, it will be flagged as shippable wit

[continue reading]
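As a sketch of what that whitelisting looks like in practice (the server and extension names below are hypothetical), the foreign server's options carry the list of extensions the user promises exist on the remote side:

```sql
-- Declare that the 'cube' and 'seg' extensions also exist on the remote
-- server, so their immutable functions and operators become candidates
-- for remote execution.
ALTER SERVER my_remote_server
    OPTIONS (ADD extensions 'cube, seg');
```

Remember the caveat from the commit message: nothing verifies this promise, so the extensions really must be installed and behave compatibly on the remote server.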