Posted by Shaun M. Thomas in 2ndQuadrant on 2018-04-20 at 17:00
Posted by Baron Schwartz on 2018-04-20 at 15:30

Do you know what database indexes are and how they work? Do they seem hard to understand? They don’t have to be. The basic principles you need to know are simple and easy to remember. And developers need to know the basics of indexing: your DBAs can’t save you, because discovering you need an index after you deploy a feature is often too little, too late.
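A minimal sketch of the idea (table, column, and query invented for the example; on a tiny table the planner may still prefer a sequential scan):

```sql
-- Without an index, a lookup by email has to scan every row.
CREATE TABLE users (
    id    serial PRIMARY KEY,
    email text NOT NULL
);

-- A B-tree index lets PostgreSQL jump straight to matching rows.
CREATE INDEX users_email_idx ON users (email);

-- EXPLAIN shows whether the planner uses the index.
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
```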

Posted by Dimitri Fontaine in CitusData on 2018-04-20 at 12:47

Continuing our series of PostgreSQL Data Types today we’re going to introduce the PostgreSQL array data types.

Arrays can be used to denormalize data and avoid lookup tables. A good rule of thumb for using them that way is that you mostly use the array as a whole, even if you might at times search for elements in the array. Heavier processing is going to be more complex than a lookup table.
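A quick sketch of that pattern (schema invented for the example): the tags live in the row itself, with no lookup table, yet individual elements remain searchable.

```sql
CREATE TABLE posts (
    id    serial PRIMARY KEY,
    title text,
    tags  text[]
);

INSERT INTO posts (title, tags)
VALUES ('PostgreSQL arrays', ARRAY['postgres', 'data-types']);

-- Mostly the array is used as a whole...
SELECT title, tags FROM posts;

-- ...but searching for a single element is still possible.
SELECT title FROM posts WHERE 'postgres' = ANY (tags);
```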

Posted by Michael Paquier on 2018-04-20 at 03:55

PostgreSQL 11 introduces a small binary called pg_verify_checksums, which is able to check if page-level checksums are in a sane state for a cluster which has been cleanly shut down (this is necessary so as not to face checksum inconsistencies because of torn pages).

Thinking about it, it is not actually complicated to extend the tool in such a way that it enables and disables checksums for an offline cluster, which resulted in a tool I called pg_checksums, available here. This compiles with PostgreSQL 11, as it has been integrated with the new set of routines which can be used if a data folder allows read permissions for a group. Note that it is not complicated to make it work with past versions as well and requires only a couple of minutes, but I keep the code in this tree free of any cross-version checks.

So, pg_checksums gains a couple of options compared to its parent:

  • Addition of an --action option, which can be set to "verify" to do what pg_verify_checksums does, "enable" to enable checksums on a cluster, and "disable" to do the reverse operation. "disable" costs almost nothing, as it is just a matter of updating the control file of the cluster. "enable" costs a lot, as it needs to update the checksum of every page and then update the control file. In both cases a sync of the data folder is done.
  • Addition of a --no-sync option, which disables the final fsync calls in charge of making the switch durable. This can be useful for automated testing environments.

So imagine that you have a cluster with checksums disabled, stopped cleanly; here is how to enable checksums:

$ pg_controldata -D $PGDATA | grep checksum
Data page checksum version:           0
$ pg_checksums --action enable -D $PGDATA
Checksum operation completed
Data checksum version: 0
Files operated:  1224
Blocks operated: 3684
Enabling checksums in cluster
$ pg_controldata -D $PGDATA | grep checksum
Data page checksum version:           1

Then let's look at the state of the checksummed pages:

$ pg_checksums --action verify 
Today I integrated readline support into pspg. What does it mean? Better comfort when entering search strings and, mainly, a persistent history of searched strings.

What is Slony?

Slony-I (referred to as just ‘Slony’ from here on out) is a third-party replication system for PostgreSQL that dates back to before version 8.0, making it one of the older options for replication available. It operates as a trigger-based replication method that is a ‘master to multiple slaves’ solution.

Slony operates by installing triggers on each table to be replicated, on both master and slaves, and every time the table gets an INSERT, UPDATE, or DELETE, it logs which record gets changed and what the change is. Outside processes, called the 'slon daemons', connect to the databases as any other client and fetch the changes from the master, then replay them on all slave nodes subscribed to the master. In a well-performing replication setup, this asynchronous replication can be expected to lag anywhere from 1 to 20 seconds behind the master.
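Conceptually, trigger-based change capture looks like the following sketch (this is not Slony's actual schema or code, and the accounts table is invented for the example; Slony's real log tables and functions differ):

```sql
-- A simplistic change log; Slony's catalog tables are more involved.
CREATE TABLE change_log (
    id        bigserial PRIMARY KEY,
    tab       text,
    operation text,
    row_data  jsonb,    -- jsonb only for illustration; Slony predates it
    logged_at timestamptz DEFAULT now()
);

CREATE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO change_log (tab, operation, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(COALESCE(NEW, OLD)));
    RETURN NULL;  -- the return value of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER capture_changes
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE log_change();

-- An outside daemon (the 'slon' role in Slony) would then read
-- change_log and replay the rows on each subscribed slave.
```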

As of this writing, the latest version of Slony is 2.2.6, which supports PostgreSQL 8.3 and above. Support continues to this day with minor updates; however, if a future version of PostgreSQL changes fundamental functionality of transactions, functions, triggers, or other core features, the Slony project may decide to discontinue large updates to support such drastic new approaches.

PostgreSQL’s mascot is an elephant known as ‘Slonik’, which is Russian for ‘little elephant’. Since this replication project is about many PostgreSQL databases replicating with each other, the Russian word for elephants (plural) is used: Slony.


  • Cluster: An instance of Slony replication.
  • Node: A specific PostgreSQL database acting as a Slony replication node, which operates as either a master or slave for a replication set.
  • Replication Set: A group of tables and / or sequences to be replicated.
  • Subscribers: A subscriber is a node that is subscribed to a replication set, and receives replication events for all tables and sequences within that set from the master node.
  • Slony Daemons: The main workers that execute replication; a Slony daemon is kicked off for every node in
One of the interesting features of the upcoming Pgpool-II 4.0 will be: "false" primary server detection in streaming replication environments.

Suppose we have one primary server and two standby servers connected to the primary server.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)

What will happen if the node 2 standby server is promoted to primary?

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)

As you can see, nothing has changed as far as the show pool_nodes command goes.
But node 2 is actually no longer a standby connected to the primary. So if large updates are sent to the primary, node 2 falls far behind the primary server, since data is no longer replicated to it.
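On the PostgreSQL side, the primary's view of its streaming standbys can be cross-checked in pg_stat_replication; a standby that has re-pointed to a new primary simply disappears from it. A sketch (column names are the PostgreSQL 10 ones; older releases call them sent_location/replay_location):

```sql
-- Run on the primary: one row per standby actually streaming from it.
SELECT application_name, client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;
```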

t-ishii@localhost: pgbench -i -p 11000 test
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.19 s, remaining 0.
Posted by Dimitri Fontaine in CitusData on 2018-04-18 at 11:41

Continuing our series of PostgreSQL Data Types today we’re going to introduce the PostgreSQL ranges data type.

Range types are a unique feature of PostgreSQL, managing two dimensions of data in a single column and allowing advanced processing. The main example is the daterange data type, which stores the lower and upper bounds of a range as a single value. This allows PostgreSQL to implement a concurrency-safe check against overlapping ranges, as we're going to see in this article.
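That overlap check is typically expressed with an exclusion constraint; a sketch (table invented for the example; the btree_gist extension is needed for the equality part of the constraint):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_booking (
    room   text,
    during daterange,
    -- No two bookings for the same room may have overlapping ranges;
    -- enforced safely even under concurrent inserts.
    EXCLUDE USING gist (room WITH =, during WITH &&)
);

INSERT INTO room_booking VALUES ('101', daterange('2018-04-20', '2018-04-23'));
-- This overlaps the first booking and fails with a constraint violation:
INSERT INTO room_booking VALUES ('101', daterange('2018-04-22', '2018-04-25'));
```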

Pgpool is less relevant today than it used to be 10 years ago, when it was a default part of a production PostgreSQL setup. Often when somebody was talking about a PostgreSQL cluster, they were referring to PostgreSQL behind pgpool and not to the PostgreSQL instance itself (which is the right use of the term). Pgpool is recognised among the most influential Postgres players: the postgresql community, commandprompt, 2ndquadrant, EDB, citusdata, postgrespro (ordered by age, not influence). I realize the level of recognition in my links is very different - I just want to emphasize the overall impact of pgpool in the postgres world. Some of the best-known current postgres "vendors" were founded after pgpool was already famous. So what makes it so famous?

Just the list of its most in-demand features makes it look great:

  • native replication
  • connection pooling
  • load balancing for read scalability
  • high availability (watchdog with virtual IP, online recovery & failover)

Well, let's make a sandbox and play. My sample setup uses master-slave mode. I would assume it is the most popular setup today, because you typically use streaming replication together with load balancing. Replication mode is barely used these days. Most DBAs skip it in favour of streaming replication and pglogical, and previously Slony.

The replication mode has many interesting settings and surely interesting functionality. But most DBAs have a master/multi-slave setup by the time they get to pgpool. So they are looking for automatic failover and a load balancer, and pgpool offers them out of the box for existing master/multi-slave environments. Not to mention that from Postgres 9.4 on, streaming replication works with no major bugs, and from 10 hash index replication is supported, so there is barely anything to stop you from using it. Also, streaming replication is asynchronous by default (configurable to synchronous and even not

On 28th of March 2018, Peter Eisentraut committed patch: Transforms for jsonb to PL/Python Add a new contrib module jsonb_plpython that provide a transform between jsonb and PL/Python. jsonb values are converted to appropriate Python types such as dicts and lists, and vice versa. Author: Anthony Bykov and then, on 3rd of April 2018, he […]

What’s common between DBA and detective? They both solve murder and mystery while trying to make sense of the nonsense.

Recently, one of our clients had a Postgres crash due to the lack of free space on the database server. The client's reaction was instantaneous - a hot standby was immediately promoted to master and the application continued working. Some details about the crash were saved, and since it's the age of clouds, devops and microservices, the failed host was recycled into a new hot standby.

Once the dust had settled, the client gave us the saved information about the failure, asking: What was it? How can I avoid it in the future?

The first thing that we did is what any good detective would do - we started looking for clues. The client provided us with crash logs and said that the pg_xlog directory was suspiciously big, while there were only around 2 thousand files of 16MB each. We looked into postgresql.log to see what the last lines before the crash were:

PANIC: could not write to file "pg_xlog/xlogtemp.9023": No space left on device
LOG: WAL writer process (PID 9023) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

Looking at other log messages we saw another FATAL record:

FATAL: archive command failed with exit code 131
DETAIL: The failed archive command was: /opt/utils/pgdb/wal-archiver pg_xlog/000000010000000000000003

That was the answer to the first question. From these two log messages, we concluded that free space was completely exhausted due to a failing archive command - Postgres could not write its transaction log and ultimately crashed.

But how can we prevent that from happening again, and what should we do in case it does happen?
Let me take a little detour here, for those who are unfamiliar with pg_xlog. pg_xlog is a directory where Postgres keeps its transaction log (also known as XLOG or WAL), which is used for recovery purposes. For various reasons, people often delete files from this directory because its name contains the word "log" and one is inclined to think "This are ju
Posted by William Ivanski in 2ndQuadrant on 2018-04-17 at 11:56

Introduced in OmniDB 2.6.0, the new OmniDB Console Tab provides an easy and comfortable way to interact with your databases. Users familiar with the psql command line utility will find that Console Tab behaves very similarly. In fact, many of the backslash commands Console Tab provides are present in psql.

For example, \? shows a list with all available commands, their syntax and descriptions. The command \h can be another friend of yours, because it shows comprehensive help about any PostgreSQL SQL command.

The editor at the bottom of the tab area is full-featured, just like the Query Tab editor (it provides syntax highlighting and autocomplete with Ctrl-Space). To execute a command, just type it in this editor. If the command is a backslash command (starts with \), just press Enter and it will be executed. If it is a regular SQL command, then it can be multi-lined, and you will need to press Ctrl-Q to execute it.

All commands and their output will be logged into the display area, which is textual and read-only, so you can copy and paste its contents somewhere else. You can clear the display area by clicking on the Clear Console button.

All commands are also logged in the global OmniDB query history, and also in a local console history, which you can see by clicking on the Command History button.

By clicking on the green check, you can borrow the command and put it into the editor, so you can amend and execute it. Another comfortable way to navigate through the history is using the shortcuts Ctrl-Up and Ctrl-Down, to quickly paste into the editor the previous and next commands, respectively.

Backslash commands such as \dt, \d+, \x and \timing are very useful when dealing with databases every day. The console tab will also show any errors and the results of any SQL command you type in a pretty way. Try it out!

In the near future we will add more commands such as \copy and also the ability for the user to create custom commands. Stay tuned!

Posted by Luca Ferrari on 2018-04-17 at 00:00

Adding a column with a default value requires a full table rewrite, and therefore it is often suggested to avoid the default value. However, it is possible to add the default value without having PostgreSQL perform the full table rewrite.

Adding a column via ALTER TABLE ADD COLUMN

In order to demonstrate what PostgreSQL does when a new column is added, consider the following simple table to begin with:

> CREATE TABLE foo( i int );
> INSERT INTO foo(i) SELECT v FROM generate_series(1, 100000) v;

Suppose it is required to add a column with a default value to the table. When an ALTER TABLE ADD COLUMN is issued and a default value is provided, PostgreSQL performs a full update of the whole table; that is, all 100k tuples are updated immediately:

> ALTER TABLE foo ADD COLUMN c char(1) DEFAULT 'A';
ALTER TABLE
Time: 180.997 ms
> SELECT distinct( c ) FROM foo;
 c
---
 A
(1 row)

As readers can see, the c column has been added and all the tuples have been updated to the default value.

When the number of tuples is really high, performing such an ADD COLUMN will result in a huge amount of database activity. Therefore, it is often suggested to perform the ADD COLUMN without a default value, so to get the...
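The commonly suggested alternative splits the work so that no single statement rewrites the whole table: add the column without a default (a metadata-only change), set the default for future rows, then backfill existing rows, ideally in batches on large tables. A sketch using the foo table from above:

```sql
-- 1. Metadata-only: existing rows simply read as NULL.
ALTER TABLE foo ADD COLUMN c char(1);

-- 2. Only affects rows inserted from now on.
ALTER TABLE foo ALTER COLUMN c SET DEFAULT 'A';

-- 3. Backfill old rows; on big tables, do this in batches.
UPDATE foo SET c = 'A' WHERE c IS NULL;
```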

Posted by Viorel Tabara in Severalnines on 2018-04-16 at 18:03

PostgreSQL has had the reputation of being rock solid from its beginnings, and over the years it has accumulated a set of impressive features. However, the peace of mind that your on-disk data is ACID compliant - if not complemented by an equally well-thought-out backup strategy - can be easily shattered.

Backup Types

Before diving into the available tools, let’s look at the available PostgreSQL backup types and what their characteristics are:

SQL dumps (or logical)

  • Does not block readers or writers.
  • Geared towards small sets of data because of the negative impact on system load and the long time required for both backup and restore operations. The performance may be increased with the --no-sync flag, but refer to the man page for the risks associated with disabling the wait for writes.
  • A post-restore ANALYZE is required in order to optimize the statistics.
  • Global objects such as roles and tablespaces can only be backed up using the pg_dumpall utility. Note that tablespace directories must be manually created prior to starting the restore.
  • Supports parallelism at the expense of increased system load. Read man pg_dump for its caveats and special requirements e.g. synchronized snapshots.
  • Dumps can be loaded into newer versions of PostgreSQL, or even on another machine architecture; however, they are not guaranteed to be backwards compatible between major versions, so some manual editing of the dump file may be required.

Filesystem (or physical)

  • Requires the database to be shut down.
  • Faster than logical backups.
  • Includes cluster data.
  • Can only be restored on the same major version of PostgreSQL.

Continuous archiving (or Point In Time Recovery or PITR)

  • Suitable for very large databases where logical or physical backups would take too long.
  • Some directories inside the data directory can be excluded to speed up the process.



Continuing our series of PostgreSQL Data Types today we’re going to introduce network address types.

PostgreSQL includes support for the cidr, inet, and macaddr data types. Again, these types come bundled with indexing support and advanced functions and operators.
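A small sketch of the three types (addresses invented for the example):

```sql
CREATE TABLE hosts (
    addr inet,     -- a host address, optionally with a netmask
    net  cidr,     -- a network specification
    mac  macaddr   -- a hardware address
);

INSERT INTO hosts
VALUES ('192.168.1.10', '192.168.1.0/24', '08:00:2b:01:02:03');

-- The <<= operator tests containment of an address within a network.
SELECT addr FROM hosts WHERE addr <<= '192.168.0.0/16';
```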

Table of Contents
  • Introduction
  • SQL features: TABLESAMPLE (9.5), GROUPING SETS (9.5), Foreign table inheritance (9.5)
  • Parallelism: Sequential Scan (9.6), Index Scan (10), Joins (9.6, 10, 11), Aggregation (9.6), Union of sets (11)
  • Access methods: BRIN Indexes (9.5), BLOOM filters (9.6)
  • Partitioning: Forms of partitioning (10, 11), Indexes (11), Partition exclusion (11), Joins and aggregates (11)
  • Internal improvements: Hashing functions (10), Executor improvements (10)
  • Sorts improvements: Abbreviated keys (9.
This coming July it will be 15 years since Pgpool was born. At that time Pgpool was just a simple connection pooler program: no replication, no load balancing. Many contributors have been working hard on enhancing Pgpool since then.

Pgpool-II developers decided last week that the upcoming Pgpool-II major version will be 4.0, rather than 3.8, to celebrate the anniversary. Of course that's not the only reason for the version number. As usual, Pgpool-II 4.0 will have exciting new features, including an enhanced authentication mechanism and more. Stay tuned.
Posted by Jan Karremans in EnterpriseDB on 2018-04-15 at 18:12

One of the attractive things of my job is this… Just a bit more often than every now and then, you get the opportunity to get out and meet people to talk about Postgres. I don’t mean the kind of talk I do every day, which has more of a commercial touch to it. – Don’t get me wrong, that is very important too! – But I mean, really talk about PostgreSQL, be part of the community and help spread the understanding of what open source database technology can do for companies. Running implementations, either small or large, trivial or mission critical…

This past week was one of those weeks.

I got to travel through Germany together with Mr. Bruce Momjian himself. Bruce is one of the most established and senior community leaders for Postgres. Bruce is also my colleague, and I would like to think I may consider him my friend. My employer, EnterpriseDB, gives us the opportunity to do this. To be an integral part of the PostgreSQL community, contribute, help expand the fame of Postgres, no strings attached. Support the success of the 30 to 40,000 engineers creating this most advanced open source RDBMS.

The week started with travel, and I got to Frankfurt. Frankfurt would be the proving ground for the idea of a pop-up meet-up. Not an EDB marketing event or somewhere where we sell EnterpriseDB services, but a chance for anyone to just discuss PostgreSQL.
We will be in a city, in a public place, answering questions, discussing things or just relaxing with some coffee. The purpose is to show what the PostgreSQL community is all about, to anyone interested!

We spent the first day in Frankfurt at the 25hrs hotel. We had some very interesting discussions on:

  • Postgres vs. Oracle community
  • Changing role of DBA:
    • The demise of the Oracle DBA
    • RDBMS DBA not so much
  • Risk management
  • “Data scientist”
  • Significance of relational growing again

In the afternoon we took the train to Munich, which was a quick and smooth experience. Munich would be the staging ground for a breakfast meeting, or a lunch… or just saying hi.

Bruce and I spend the day disc

Posted by Thomas Munro in EnterpriseDB on 2018-04-14 at 10:17
PostgreSQL is a portable RDBMS targeting POSIX systems (and Windows).  It also makes some assumptions about the operating system and hardware it's running on that are not covered by POSIX, but hold on all typical systems.  For example:
  • we assume that 32 bit aligned integers can be read and written atomically; that is, without any kind of synchronisation, you might read an arbitrarily stale value but you won't see a "torn" value with a mixture of bits from the before and after values of a concurrent write
  • we assume that system calls (or at least IPC-related syscalls) synchronise memory; that is, if you write to shared memory and then signal another process, the other process will then be able to read the value
  • we assume that disk blocks of 512 bytes (or some multiple) are written atomically when preallocated; that is, if you lose power and then come back up, you'll either see the old or the new version of a 512-byte block, and not a mixture of bits from the two or other kinds of corruption; specifically, the pg_control file won't be corrupted by torn writes because its size is not allowed to exceed PG_CONTROL_MAX_SAFE_SIZE
On the other hand, it makes some assumptions that are based on (one reading of) POSIX (not to mention the POLA, at least from the vantage point of a humble userland muggle) that turned out not to be entirely reflected in reality on a certain extremely popular OS:
  • we assume that it is safe to write to a file, then close it, and then reopen it from any process and call fsync()
  • we assume that if fsync() reports a failure, then it's safe to retry by calling fsync() again; specifically, if a checkpoint fails because fsync() failed, then we can try to create another checkpoint (for example during shutdown) and a successful fsync() again and know that everything we wrote is on disk

While there seems to be plenty of evidence that calling fsync() from another process does the right thing under normal circumstances, it turns out that write-back errors from buffered IO are not always propagated to u
Some time ago I removed because of some problems with underlying formatting library. Now, it's back. New code, new approach, hopefully usable site. So, previously I was using pgFormatter library, which was using pretty smart regexps to format SQL. But this library was improved in a non-backward-compatible way, and I couldn't really work on […]

Continuing our series of PostgreSQL Data Types today we’re going to introduce date and time based processing functions.

Once the application's data, or rather the user data, is properly stored as timestamp with time zone, PostgreSQL allows implementing all the processing you need. In this article we dive into a set of examples to help you get started with time-based processing in your database. Can we boost your reporting skills?
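A taste of the kind of processing involved, as a sketch:

```sql
-- Truncate timestamps to a reporting bucket, e.g. per day.
SELECT date_trunc('day', now());

-- extract() pulls individual fields out of a timestamp.
SELECT extract(year FROM now()), extract(dow FROM now());

-- Interval arithmetic works directly on timestamptz values.
SELECT now() - interval '1 month';
```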

Posted by Luca Ferrari on 2018-04-13 at 00:00

The local Linux Users’ Group ConoscereLinux is delivering a six-part course on PostgreSQL 10. Guess who’s lecturing…

PostgreSQL 10 short course in Modena

I'm doing a short course on PostgreSQL, with particular regard to version 10, in Modena. Thanks to the local Linux Users' Group (LUG) ConoscereLinux, which provided all the infrastructure for the course, I will introduce attendees to basic SQL concepts and how PostgreSQL works.

The course schedule is available at the official course page, and the course will be based on 6 lessons (2 already done). Attendees can come with their own laptops, and lessons will be “live”: I will show concepts while explaining on my own laptop running PostgreSQL 10.1.

All the slides will be available for free on the course page, and are based on my work available on my github repository. The course will take place every week on Tuesday evening. So far the attendees are very interested in PostgreSQL and its technology, and are curious about evaluating all its features as a

I really have to thank the ConoscereLinux LUG, and Luca and Massimiliano in particular, for giving me this chance and, most notably, for waiting for me to be ready after my last eye surgery, and for driving me home!

Continuing our series of PostgreSQL Data Types today we’re going to introduce date, timestamp, and interval data types.

PostgreSQL's implementation of the calendar is very good, and we're going to show some nice examples of how confusing this matter is. The time zone notion in particular is mainly a political tool these days, and it makes no sense from an engineering standpoint: there's no way to solve time zone problems from first principles!
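One classic source of confusion is that the displayed value of a timestamptz depends on the session's TimeZone setting; a sketch:

```sql
-- The same instant renders differently per session time zone.
SET timezone TO 'UTC';
SELECT timestamptz '2018-03-25 01:30:00+00';

SET timezone TO 'Europe/Paris';
SELECT timestamptz '2018-03-25 01:30:00+00';
-- In Paris this instant lands just after the DST switch,
-- so the local clock reading jumps forward.
```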

Posted by Laurenz Albe in Cybertec on 2018-04-12 at 08:20
Another way to get rid of unused indexes...
© Laurenz Albe 2018

Why should I get rid of unused indexes?

Everybody knows that a database index is a good thing because it can speed up SQL queries. But this does not come for free.

The disadvantages of indexes are:

  • Indexes use up space. It is not unusual for database indexes to use as much storage space as the data themselves. And the kind of reliable, fast storage you want for a database is not necessarily cheap.
    The space used up by indexes also increases the size and duration of physical backups.
  • Indexes slow down data modification. Whenever you INSERT into or DELETE from a table, all indexes have to be modified, in addition to the table itself (the “heap”).
    And it is much more expensive to modify the complicated data structure of an index than the heap itself, which has its name precisely because it is basically an unordered “pile” of data (and as everybody knows, maintaining order is more work than having a mess). Modifying an indexed table can easily be an order of magnitude more expensive than modifying an unindexed table.
  • Indexes prevent HOT updates. Because of the architecture of PostgreSQL, every UPDATE causes a new row version (“tuple”) to be written, and that causes a new entry in every index on the table.
    This behavior has been dubbed “write amplification” and has drawn a lot of fire. This undesirable effect can be avoided if a) the new tuple fits into the same table block as the old one and b) no indexed row is modified. Then PostgreSQL creates the new tuple as a “Heap Only Tuple” (hence HOT), which is much more efficient and also reduces the work VACUUM has to do.
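Given those costs, a first pass at finding candidates for removal is to look for indexes that have never been scanned; a sketch (it deliberately skips unique and primary key indexes, which may look "unused" by queries yet still be required; always double-check before dropping anything):

```sql
SELECT s.schemaname, s.relname AS table_name, s.indexrelname AS index_name
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0          -- never used since the stats were reset
  AND NOT i.indisunique
  AND NOT i.indisprimary;
```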

The many uses of indexes

Now we know that we don’t want unnecessary indexes. The problem is that indexes serve so many purposes that it is difficult to determine if a certain index is needed or not.

Here is a list of all benefits of indexes in PostgreSQL:

  1. Indexes can speed up queries that use indexed columns (or expressions) in the WHERE clause.
    Everybody knows that one!
    The traditional B-tree index supports the <, <=, =, >= and >
Posted by Luca Ferrari on 2018-04-12 at 00:00

Today a friend of mine asked me about a problem between PostgreSQL 10 and Python. Since I'm not a pythonist, my quick answer was that PostgreSQL 10 does support Python 3. But it turned out it was not so simple (at least, not as simple as getting Perl 5 working!).

PostgreSQL 10, Python 3 and FreeBSD (and Ubuntu)

tl;dr it can be done (of course!)

postgres=# SELECT pyv();
                                    pyv
----------------------------------------------------------------------------
 3.6.4 (default, Jan  2 2018, 01:25:35)                                    +
 [GCC 4.2.1 Compatible FreeBSD Clang 4.0.0 (tags/RELEASE_400/final 297347)]

postgres=# SELECT proname, prolang, prosrc FROM pg_proc WHERE proname = 'pyv';
-[ RECORD 1 ]-----------------
proname | pyv
prolang | 16387
prosrc  |                     +
        | import sys          +
        | return sys.version  +
        |

My default environment for running PostgreSQL is FreeBSD, so in order to get plpython working I jumped to the console and installed the package postgresql10-plpython-10.3, thinking of course that FreeBSD would do the right thing. Unluckily it did not!

Creating a plpython3u language did not succeed; the problem was that the above package installed files only under the lib directory (e.g., /usr/local/lib/postgresql). I then tried installing the port of the very same name, but again it was installing...
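For reference, once the support files are in place, the language and the pyv() function shown above boil down to (matching the prosrc visible earlier):

```sql
CREATE EXTENSION plpython3u;

CREATE FUNCTION pyv() RETURNS text AS $$
import sys
return sys.version
$$ LANGUAGE plpython3u;

SELECT pyv();
```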

Continuing our series of PostgreSQL Data Types today we’re going to introduce some of the PostgreSQL text processing functions.

There's a very rich set of PostgreSQL functions to process text - you can find them all in the string functions and operators documentation chapter - with functions such as overlay(), substring(), position() or trim(). Or aggregates such as string_agg(). There are also regular expression functions, including the very powerful regexp_split_to_table(). In this article we see practical examples putting them to use.
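A small sample of them, as a sketch:

```sql
-- Basic manipulation.
SELECT substring('PostgreSQL' FROM 1 FOR 8);   -- 'PostgreS'
SELECT trim(both ' ' FROM '  hello  ');        -- 'hello'

-- Aggregate many rows into a single string.
SELECT string_agg(t, ', ') FROM (VALUES ('a'), ('b'), ('c')) AS v(t);

-- Split a string into rows using a regular expression.
SELECT regexp_split_to_table('one,two;three', '[,;]');
```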

In support of our theme of "People, Postgres, Data", PostgresConf US 2018 hosts its 2nd annual Talent Exchange & Career Fair on Friday, April 20, from 10:30 am - 1:30 pm. This event is free for job seekers and open to the public.
Companies represented include:
  • Amazon
  • Pivotal
  • 2ndQuadrant
  • Microsoft
  • Timescale
  • HighGo
  • Data Dog
  • Fivetran
  • BlueTalon
  • Goldman Sachs
  • forhims
  • Immuta
Last year, the Career Fair brought together employers and candidates in one of the hottest career options available today, Postgres DBA.
New for this year -- Career Mentors and Resume Reviewers
Check out the full schedule for PostgresConf US 2018, and buy your tickets soon!

The feature freeze for the PostgreSQL 11 release is now upon us. During the last few days my colleague Álvaro Herrera pushed two changes into the development branch of PostgreSQL:

1. Faster Partition Pruning
2. Partition Pruning at Execution Time

These patches aim to improve the performance and usability of the declarative table partitioning feature (added in PostgreSQL 10). Amit Langote wrote the first of these two patches, with some assistance from me. I’m the author of the second patch. This one is based on an original patch by Beena Emerson.


Internally in PostgreSQL, a partitioned table is made up of a series of individual tables. These tables are all grouped under one common parent partitioned table. Queries run against the partitioned table need the results of each individual table to be "concatenated" before the final result is produced. Many queries, especially OLTP-type queries, will only require data from a small number of partitions (perhaps just 1!). In order to save PostgreSQL from having to needlessly trawl through all partitions for data that might not even be there, PostgreSQL tries to eliminate partitions that won't contain any needed records.
In PostgreSQL 10 this elimination took place via the "constraint_exclusion" mechanism, a linear algorithm that required looking at each partition's metadata one by one to check if the partition matched the query's WHERE clause.

Faster Partition Pruning

In PostgreSQL 11 this elimination of unneeded partitions (aka partition pruning) is no longer an exhaustive linear search.  A binary search quickly identifies matching LIST and RANGE partitions. A hashing function finds the matching partitions for HASH partitioned tables, which are new in PG11.

The patches also make improvements so that partitions can be pruned in a few more cases than was previously possible.
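A minimal example of pruning at work (PostgreSQL 10+ declarative partitioning syntax; table invented for the example):

```sql
CREATE TABLE measurements (
    logdate date,
    reading int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2018_03 PARTITION OF measurements
    FOR VALUES FROM ('2018-03-01') TO ('2018-04-01');
CREATE TABLE measurements_2018_04 PARTITION OF measurements
    FOR VALUES FROM ('2018-04-01') TO ('2018-05-01');

-- Only the April partition should appear in the plan;
-- the March partition is pruned away.
EXPLAIN SELECT * FROM measurements WHERE logdate = '2018-04-20';
```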

Partition Pruning at Execution Time

Required partitions are normally identified only by the query planner, however, it’s only able to perform this identification process us


Going into production is a very important task that must be carefully thought out and planned beforehand. Some not-so-good decisions may be easily corrected afterwards, but others may not. So it is always better to spend that extra time reading the official docs, books and research made by others early, than to be sorry later. This is true for most computer systems deployments, and PostgreSQL is no exception.

System Initial Planning

Some decisions must be taken early on, before the system goes live. The PostgreSQL DBA must answer a number of questions: Will the DB run on bare metal, VMs, or even containerized? Will it run on the organization's premises or in the cloud? Which OS will be used? Is the storage going to be of the spinning disk type or SSDs? For each scenario or decision, there are pros and cons, and the final call will be made in cooperation with the stakeholders according to the organization's requirements. Traditionally, people used to run PostgreSQL on bare metal, but this has changed dramatically in recent years, with more and more cloud providers offering PostgreSQL as a standard option, which is a sign of the wide adoption and a result of the increasing popularity of PostgreSQL. Independently of the specific solution, the DBA must ensure that the data will be safe, meaning that the database will be able to survive crashes, and this is the number one criterion when making decisions about hardware and storage. So this brings us to the first tip!

Tip 1

No matter what the disk controller or disk manufacturer or cloud storage provider advertises, you should always make sure that the storage does not lie about fsync. Once fsync returns OK, the data should be safe on the medium no matter what happens afterwards (crash, power failure, etc). One nice tool that will help you test the reliability of your disks’ write-back cache is

Just read the notes: and do the test.

Use one machine to listen to events and the actual machine to test. You should see:

 verifying: 0