Posted by Shaun M. Thomas on 2016-06-24 at 17:35

On a higher level, Postgres has a bevy of libraries, interfaces, and clients for accessing a database instance. From language APIs to GUIs like pgAdmin, or SaaS entries like JackDB, every flavor of interaction is covered. And yet, that’s only a small part of the story. For those who dare to tread into the watery depths, there’s also the world of dark incantations that is the command-line.

While most are aware of psql, the Postgres command-line client for accessing databases, there are far more creatures lurking in the black oblivion which deserve more visibility. In the spirit of getting the most from our tools, let’s take a look at these and consider how they might be useful in scenarios that aren’t restricted to boring old DBA considerations. To that end, let’s cover some of the standard client applications that anyone might use.

To begin, there are several commands that cover standard SQL operations. Things like creating or dropping databases, users, or procedural languages are all possible without actually connecting to Postgres. Normally, we’d have to use the equivalent SQL commands:

CREATE USER kitty_cat WITH PASSWORD 'M30w9|_|rr';
DROP USER kitty_cat;
DROP LANGUAGE plpythonu;

While it’s possible to invoke all of these by forcing them through psql -c, that makes automation, testing, and other tasks somewhat awkward, or forces sysadmins to learn SQL for no reason. So Postgres provides functional wrappers we can use instead. They’re named about what we might expect, too:

createdb foo
dropdb foo
createuser kitty_cat --pwprompt
dropuser kitty_cat
createlang plpythonu
droplang plpythonu

As would be expected, all of these commands include the optional parameters from their SQL equivalents. This means a couple of handy things: flags can be listed with -? or --help, and longer descriptions are available via comprehensive man pages. The Postgres online documentation is great, and having all of this information at our fingertip


My slides from my presentation at PgConf.US 2016:


We have recently published a patch to provide full database encryption for PostgreSQL. Many business critical applications require full encryption to match legal or business requirements. As more and more customers asked for this, we decided to tackle this important issue. The idea behind our patch is to store data on disk in encrypted format […]

The post PostgreSQL instance-level encryption appeared first on Cybertec - The PostgreSQL Database Company.

Approximation in databases seems to be an alien concept at first. But if your application can deal with a known and controllable degree of error, it can help even in cases where conventional tuning is not an option for whatever reason.

Approximation is not evil

One of the top requirements for database systems is reliability. Whether you run a bank or a large retail business, you don't want to lose a cent here or there or charge your customer twice for the Pink Fluffy Unicorn he just bought, just because the DBMS gave a wrong answer. Classic OLTP operations have to be always 100% correct.

However, for the case of analytics, things become different. In some cases, it can be desirable to trade a bit of accuracy for a lot of speed. This is called approximation and to many database people (and users), the concept of accepting results with less than 100% accuracy seems strange at first.

But if you know - and can control - the error introduced by approximation, it is not. It can even be very useful, if a 95% accurate answer now is worth more than a 100% accurate answer tomorrow.

Welcome to approximation in PostgreSQL 9.5.

Approximating queries

Approximate queries work on subsets of the whole data set (the population), called samples.
If the sampling is done in a statistically correct way, a sample much smaller than the whole population gives answers close to the real answer, within a known error range.

A possible application for the hypothetical retail business would be to find which product is currently trending.
Instead of knowing that exactly 500, 1000 and 2000 Pink Fluffy Unicorns were sold in the last three weeks, knowing that 498, 1001 and 1999 were sold with, let's say, a 5% error tells the procurement people that Pink Fluffy Unicorns are a trending product just as well as the exact numbers do. Only, they might have to wait a few seconds for the answer instead of a few hours...
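For a sketch of what such a trending query could look like with 9.5's built-in sampling (the sales table, column names, and sampling percentage here are hypothetical, not from the post):

```sql
-- TABLESAMPLE SYSTEM (1) reads roughly 1% of the table's pages, so the
-- counts are scaled back up by 100; REPEATABLE pins the sample for reruns.
SELECT name, count(*) * 100 AS estimated_sold
FROM sales TABLESAMPLE SYSTEM (1) REPEATABLE (42)
WHERE sold_at > now() - interval '3 weeks'
GROUP BY name
ORDER BY estimated_sold DESC;
```

SYSTEM sampling picks whole pages and is fast but clustered; BERNOULLI samples individual rows at a higher cost.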

PostgreSQL 9.5 has built-in support for approximate queries. Because I'm lazy and already wrote a

PostgreSQL 9.6 is shipping with a new contrib module for examining the visibility map of a relation:

Add pg_visibility contrib module.

This lets you examine the visibility map as well as page-level
visibility information.  I initially wrote it as a debugging aid,
but was encouraged to polish it for commit.

Patch by me, reviewed by Masahiko Sawada.


The visibility map, stored with each relation in its own file named with the suffix _vm, tracks the visibility of tuples on relation pages. Up to 9.5, 1 bit was used per heap page; if this bit is set, all the tuples stored on that page are visible to all transactions. In 9.6, 2 bits are used: the extra bit tracks whether all tuples on a given page have been frozen, critically improving VACUUM performance by preventing unnecessary full-table scans.
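As a back-of-the-envelope sketch (my own arithmetic, not from the post), two bits per 8 kB heap page keeps the visibility map tiny even for large relations:

```python
def vm_size_bytes(table_bytes, page_size=8192, bits_per_page=2):
    """Approximate visibility map size for a table of the given size.

    Each heap page contributes bits_per_page bits to the map
    (1 bit up to 9.5, 2 bits from 9.6 on).
    """
    pages = -(-table_bytes // page_size)        # ceiling division
    return -(-(pages * bits_per_page) // 8)     # bits -> bytes, rounded up

# A 1 GB table needs only ~32 kB of visibility map in 9.6:
print(vm_size_bytes(1024**3))   # → 32768
```

Doubling the per-page bits in 9.6 therefore costs almost nothing in storage relative to the heap itself.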

pg_visibility contains a couple of functions that let one look at the status of each page’s bits. The first one, aimed at general use, gives insight into the all-frozen and all-visible status of each page of a relation, plus the status of PD_ALL_VISIBLE, which carries the same information as the all-visible flag except that it is stored in the heap page itself rather than in the VM file:

=# CREATE TABLE tab_visible AS SELECT generate_series(1,1000) AS id;
=# SELECT * FROM pg_visibility('tab_visible'::regclass);
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | f           | f          | f
     1 | f           | f          | f
     2 | f           | f          | f
     3 | f           | f          | f
     4 | f           | f          | f
(5 rows)

This function can take an optional argument in the shape of a block number. pg_visibility_map is similar to the previous function, except that it does not scan the all-visible flag value on the page and it just fetches what is available on the visi

Posted by Ian Barwick in 2ndQuadrant on 2016-06-22 at 07:16

Greetings, honourable reader.

My name is Barwick of the 2ndQuadrant Company. With the onset of the Plum Rain, in my humble estimation it has become hot and humid recently. I trust all is well in your exalted undertakings?

No, I haven’t flipped a bit – the above is actually pretty much how a Japanese business letter – or email – starts off, and there’s a plethora of sites such as this one providing examples of various kinds of formal email. But fear not, this won’t be a lesson in written Japanese business etiquette, but the first in an occasional series of posts from Japan, a country which embraced PostgreSQL very early on and has contributed much to its development.

A potted PostgreSQL history, Japanese style

Back in 2006, on a visit to Japan before I started living here, I happened to wander into a bookstore and was looking for something on handling Japanese full-text search (a tricky subject in a language which has three or four different writing systems and multiple ways of writing the same word but no spaces), when something odd struck me – there were clearly more PostgreSQL books than MySQL ones. This was at a time when PostgreSQL was globally still in a bit of a niche and an equivalent bookshop in Europe would feature a couple of often dated volumes
cowering next to masses of dolphin-orientated literature.

Occasionally it’s postulated that PostgreSQL’s popularity in Japan is because it’s easier to pronounce than MySQL – this is a country where “Los Angeles” is abbreviated to “Rosu“, so it’s no surprise that the even more cumbersome rendering of PostgreSQL (“posutogure-esu-kyuu-eru“) is shortened to “posugure” (look it up on Google if you don’t believe me) – whereas the equivalent “mai” is short and ambiguous to use in normal conversation.

However the reality is more prosaic – during the mid/late 1990s, as the internet was taking off and open source databases were becoming viable, PostgreSQL got a head start in the accessibility stakes. A Japanese-language mailing list was set up as far back as Postgres95 an


Since BRIN indexes were introduced in PostgreSQL 9.5, many people have gladly adopted this new index type. A lot has been written about this new feature and a lot of positive feedback has been reported. While BRIN indexes are clearly a success and definitely a win, some people tend to exaggerate and use them […]

The post BRIN indexes: Correlation, correlation, correlation appeared first on Cybertec - The PostgreSQL Database Company.

Posted by gabrielle roth on 2016-06-21 at 01:36
Last year I commented that “autovacuum log messages are still missing in [RDS] 9.4“. Amazon fixed that for versions 9.4.5 and up with the custom GUC rds.force_autovacuum_logging_level. I discovered this week that it’s also available on earlier versions of RDS – I have it enabled on a 9.3.10 instance. The default value in the config […]
Posted by Joe Abbate on 2016-06-20 at 20:54

When I started working on Pyrseas, I reviewed several other products. Robert Brewer’s Post Facto was probably the one with the most distinctive design. Although it compared database schemas in order to generate SQL to sync them up, it did not store database object definitions in a standard VCS repository. Rather, it used a Postgres database as the repository analog.

While Post Facto’s design certainly influenced Pyrseas, there is one aspect of  the former that, unfortunately, I did not emulate.

The Dependables

As any developer knows, database objects have dependencies on each other: table A has a primary key PK1, table B is declared with a foreign key dependent on PK1, function C is dependent on type X, view D is based on table A and includes a call to function C.

Pyrseas currently deals with these dependencies in an object-specific manner. For example, it does at least two passes through pg_class objects (tables, views, sequences, etc.) in order to create, alter or drop these objects in the correct order. However, this ad hoc approach can result in incorrect sequencing of generated SQL statements in some cases, particularly those like view D above.

The missing feature from Post Facto that avoids this conundrum? If you answered topological sort you were obviously paying attention in your Algorithms class. If you didn’t, may I suggest chapter 15, “Devising and engineering an algorithm: Topological Sort” of Bertrand Meyer’s Touch of Class.
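A minimal sketch of how such a dependency-aware ordering could work (Kahn's algorithm; the object names mirror the example above and are purely illustrative, not Pyrseas' actual implementation):

```python
from collections import defaultdict, deque

def topo_sort(deps):
    """Kahn's algorithm. deps maps an object to the set of objects it
    depends on; returns a creation order where every object comes after
    all of its dependencies."""
    dependents = defaultdict(set)
    indegree = {obj: len(reqs) for obj, reqs in deps.items()}
    for obj, reqs in deps.items():
        for req in reqs:
            dependents[req].add(obj)
    queue = deque(sorted(o for o, d in indegree.items() if d == 0))
    order = []
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for dep in sorted(dependents[obj]):
            indegree[dep] -= 1
            if indegree[dep] == 0:
                queue.append(dep)
    if len(order) != len(deps):
        raise ValueError("dependency cycle detected")
    return order

# The example from the text: view D depends on table A and function C, etc.
deps = {
    'table_a': set(), 'pk1': {'table_a'}, 'table_b': {'pk1'},
    'type_x': set(), 'func_c': {'type_x'}, 'view_d': {'table_a', 'func_c'},
}
print(topo_sort(deps))
# → ['table_a', 'type_x', 'pk1', 'func_c', 'table_b', 'view_d']
```

Reversing the order gives a valid DROP sequence, which is exactly why a general topological sort beats per-object-type special cases.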

Daniele’s Quest

Over two years ago, someone opened an issue about the need to create primary keys before creating views. Later, Daniele Varrazzo reported another issue with dependencies.

Many of you Postgres users will recognize Daniele as the maintainer of Psycopg, the popular Python PG adapter, which of course is used by Pyrseas.  Daniele and I chatted online, I mentioned Post Facto’s solution and he, fortuitously and generously, started implementing a topological sort on a deptrack branch of Pyrseas.

We then collaborated for about eight months. He did most of the initial coding

Posted by Colin Copeland on 2016-06-20 at 11:30

The Django 1.8 release added support for complex query expressions. The documentation has some nice examples but they don't do justice to how crazy awesome these are. In this post, we will go through some additional examples of how to leverage these expressions.

Django has had one form of a query expression for several years now: the F expression. F() can be used to reference an existing column in a query. This is often used for atomic update operations, such as incrementing a counter. However, F can also be used to compare two columns on a model when filtering. For instance, we may be interested in users who haven't logged in since their first two weeks on the site. That requires comparing the value of the last_login column and the date_joined on the standard User model from contrib.auth:

from datetime import timedelta

from django.contrib.auth.models import User
from django.db.models import F
from django.utils.timezone import now

# Create some fake data: 10 active users and 20 inactive ones
today = now()
active_count = 10
inactive_count = 20
for i in range(1, active_count + inactive_count + 1):
    active = i <= active_count
    prefix = '' if active else 'in'
    domain = '' if i % 3 == 0 else ''
    attributes = {
        'username': '{}active-{}'.format(prefix, i),
        'email': '{}active-{}@{}'.format(prefix, i, domain),
        'date_joined': today - timedelta(days=30),
        'last_login': today - timedelta(days=0 if active else 21),
    }
    User.objects.create(**attributes)

# Query inactive users
inactive = User.objects.filter(last_login__lte=F('date_joined') + timedelta(days=14))

The F expression supports basic arithmetic operations including some date math, as seen in the example above. However, it is still very limited in comparison to what is available in SQL.

Relational databases such as PostgreSQL support a number of built-in functions which you can leverage in the ORM using the Func expression added in 1.8. For example, you may want to examine the email doma


Hi pgpool-II users,

I've managed to come back to this blog after a three-year break.

Last week the pgpool-II developer team released minor versions from 3.1 to 3.5. This release includes a special gift for users: an enhancement to health checking.

You might notice an annoying behavior of pgpool-II.

Example: suppose we have three PostgreSQL backends managed by pgpool-II. pgpool-II occasionally checks the health of each backend if "health check" is enabled. If backend #2 goes down, a failover is triggered, and after that users can use the DB server cluster without backend #2. This is great. However, if you set up retrying of the health check, clients cannot connect to pgpool-II while it is retrying. For instance,

health_check_max_retries = 10
health_check_retry_delay = 6

will continue the health check retry for at least 10*6 = 60 seconds. This is very annoying because users cannot initiate new connections to pgpool-II for a full minute. Making these parameters smaller might mitigate the situation a little, but that may not help if the network is unstable and longer retries are desirable.

These new releases significantly improve the situation. By setting:

fail_over_on_backend_error = off

when a user connects to pgpool-II while it is performing a health check, pgpool-II starts to connect to all backends, including #2. Before this release, pgpool-II gave up initiating the session if one of the backends was not available (in this case #2). With this release, pgpool-II skips the broken backend and continues to connect to the rest of the backends. Please note, however, that this feature is only available when all of the conditions below are met:

  • streaming replication mode
  • the broken backend is not primary server
  • fail_over_on_backend_error  is off

This enhancement is available in all of the new releases: 3.5.3, 3.4.7, 3.3.11, 3.2.16, 3.1.19.


PostgreSQL 9.6 has extended the traditional framework available for physical backups by allowing users to take backups concurrently. Barman will transparently support this new set of functions without requiring the pgespresso extension.

The pgespresso extension, conceived by our Simon Riggs, allowed marking the start and the stop of a backup process even on a read-only standby server. Through pgespresso, Barman users could take physical backups through rsync/SSH from a standby server, off-loading the actual copy of files from the master server.

This feature is called concurrent backup, and is already available in PostgreSQL through streaming replication protocol, via pg_basebackup.

The last version that pgespresso will support in terms of concurrent backup is PostgreSQL 9.6. Why is that?

The available PostgreSQL API for taking low-level physical backups has been extended in order to natively support concurrent backups (rather, I should use the term “non-exclusive”). I am not sure if Magnus (the author of this patch) was inspired by pgespresso or not, but what is important is that his contribution is definitely more generic and robust (after all, pgespresso was designed to interact only with Barman).

Therefore, PostgreSQL 9.6 and future versions will natively have functions that allow Barman to request a concurrent backup, making pgespresso not necessary.

For more details about this new API, please refer to the section that I have written in the PostgreSQL Wiki page “What’s new in PostgreSQL 9.6″. For now, it is important to know that:

  • pg_start_backup() has been overloaded, and a new parameter specifies whether the backup is exclusive (the default) or not;
  • a new version of pg_stop_backup() has been provided for concurrent backups: from a technical point of view, this function will now return the content of the backup label and the map of available tablespaces.
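As a hedged sketch of the new non-exclusive calls (the backup label is arbitrary and the copy step is elided; the session issuing pg_start_backup() must stay open until pg_stop_backup()):

```sql
-- 9.6 non-exclusive backup: the third argument of pg_start_backup()
-- is "exclusive", so we pass false.
SELECT pg_start_backup('barman_backup', false, false);

-- ... file-level copy of the data directory happens here ...

-- The non-exclusive pg_stop_backup() returns the stop LSN plus the
-- contents of the backup label and the tablespace map.
SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false);
```

Because the label and tablespace map are returned to the caller instead of being written into the data directory, several such backups can run concurrently.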

What is relevant to our Barman users is that Barman will transparently handle this new API, with no impact on the user experience whatsoever.

By default, B

Hope you found my two previous posts on transactions interesting and useful. Based on some readers' requests, I will cover a few advanced transaction topics in subsequent posts. In this post, I am going to cover the following advanced variants of the traditional transaction:
     1. Savepoint
     2. Prepared Transaction


In order to understand its usage, imagine one of your customers' workloads as below:
            Perform operation-1
            Perform operation-2
            Perform operation-3
            Perform operation-4

As per the business logic, it is likely that operation-3 and operation-4 will fail, and their failure does not impact operation-1 and operation-2. Now as per transaction properties (recall the Basics of Transactions), the whole transaction will be rolled back if any of the four operations fails. So in the given business scenario, operation-1 and operation-2 should not be rolled back, but they still will be.

To avoid this behavior and control the main transaction's life span, savepoints were introduced.

A savepoint can be created only inside a transaction block. It creates a saving point for the overall transaction, i.e. if any command results in an error inside the transaction block, then instead of rolling back the whole transaction, it allows rolling back to the point where the savepoint was created. E.g. consider the following example:

        BEGIN;
        INSERT INTO TBL VALUES(100, 100);
        INSERT INTO TBL VALUES(200, 200);
        SAVEPOINT first_save;
        INSERT INTO TBL VALUES(200, 300); <=== This will fail because of unique index.
        ROLLBACK TO SAVEPOINT first_save; <=== The first two INSERTs survive.

Related Commands

1. SAVEPOINT savepoint_name: This command creates a new savepoint inside a transaction block, as shown in the above example. Savepoints can also be nested, i.e. a savepoint can be created within an already created savepoint.

2. ROLLBACK TO [SAVEPOINT ] savepoint_name: T
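Putting the two commands together, a minimal sketch of nested savepoints (table and savepoint names are illustrative):

```sql
BEGIN;
INSERT INTO tbl VALUES (1);
SAVEPOINT outer_sp;
INSERT INTO tbl VALUES (2);
SAVEPOINT inner_sp;               -- nested inside outer_sp
INSERT INTO tbl VALUES (3);
ROLLBACK TO SAVEPOINT inner_sp;   -- undoes only the third INSERT
ROLLBACK TO SAVEPOINT outer_sp;   -- undoes the second INSERT as well
COMMIT;                           -- only the first INSERT is committed
```

Rolling back to an outer savepoint implicitly destroys all savepoints nested inside it.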
Posted by Shaun M. Thomas on 2016-06-17 at 18:26

One way the Postgres project is subtly misleading is that it becomes easy to forget that not all other projects are nearly as well managed. This becomes more relevant when delving into niches that lack sufficient visibility to expose the more obvious deficiencies. As much as we like Postgres, it’s not quite as popular as it could be. This makes some of the side projects infrequently used, and as a direct consequence, they can often resemble jerky automatons cobbled together out of spit and baling wire.

A good example of this is the hdfs_fdw extension for accessing Hadoop. To be fair, a large portion of its complexity is due to dependency on Hive and hastily assembled APIs like Thrift and fb303. Unfortunately it also suffers from “Works for Me” syndrome, lacking an autoconf to autodiscover these requirements, and it also doesn’t use automake to properly bootstrap the make environment. As a result, most builds will fail outright or require hand-modifying the Makefile—a task many will simply balk at before abandoning the extension outright.

So, let’s install the hdfs_fdw extension, going through all the necessary steps, and not just some incomplete shorthand that makes assumptions regarding the build system. To facilitate this, no default install locations will be used at all, because hardcoded defaults are how we got into this mess in the first place.

The hdfs_fdw extension depends on Thrift, so let’s start there. As of this writing, the latest version is 0.9.3. Here’s the full build process we used:

tar -xzf thrift-0.9.3.tar.gz
cd thrift-0.9.3
./configure --prefix=/opt/thrift
make -j2
sudo make install

So far, this is pretty normal. What isn’t normal is that the Thrift source includes a contrib module named fb303 that we also need. Unlike the Postgres build environment, configuration settings and subsequent Makefile components do not cascade to the contrib modules. This is where the trouble begins.

The first issue is that, unlike Thrift, fb303 d


Faceted search is a very popular buzzword nowadays. In short, the specialty of faceted search is that its results are organized by category. Popular search engines are adding special support for faceted search.

Let’s see what PostgreSQL can do in this field. At first, let’s formalize our task. For each category which has matching documents, we want to obtain:

  • Total number of matching documents;
  • TOP N matching documents.

For sure, it’s possible to query such data using multiple per-category SQL queries. But we’ll do it in a single SQL query, which would also be faster in the majority of cases. The query below implements faceted search over the PostgreSQL mailing list archives using window functions and a CTE. Usage of the window function is essential, while the CTE was used for better query readability.

Faceted search SQL query:

/*
 * Select all matching messages, calculate rank within list and total count
 * within list using window functions.
 */
WITH msg AS (
    SELECT message_id, subject, list,
           RANK() OVER (
               PARTITION BY list
               ORDER BY ts_rank_cd(body_tsvector, plainto_tsquery('index bloat')), id
           ) rank,
           COUNT(*) OVER (PARTITION BY list) cnt
    FROM messages
    WHERE body_tsvector @@ plainto_tsquery('index bloat')
),
/* Aggregate messages and count per list into json. */
lst AS (
    SELECT list,
           jsonb_build_object(
               'count', cnt,
               'results', jsonb_agg(
                   jsonb_build_object(
                       'message_id', message_id,
                       'subject', subject
                   ))) AS data
    FROM msg
    WHERE rank <= 5
    GROUP BY list, cnt
)
/* Aggregate per list data into single json */
SELECT jsonb_object_agg(list, data) FROM lst;

The resulting JSON document contains total count of matching mailing list messages and TOP 5 relevant messages for each list.

Faceted search JSON result:

{
  "pgsql-admin": {
    "count": 263,
    "results": [
      {"message_id": "", "subject": "Re: Slow planning time"},
      {"message_id": "", "subject": "Re: Finetuning Autovacuum"},
      {"message_id": "AANLkTikWabMzCRCSWuNLuPizSSQX


The WAL compression feature

With the launch of Postgres 9.5, a new setting called "wal_compression" was introduced that should decrease the IO load at the expense of CPU load. And that is a desirable compromise considering typical constraints of modern hardware. But somehow the feature didn’t get the publicity it deserved – it wasn’t even mentioned in the What’s new roundup and by default […]

The post PostgreSQL underused features – WAL compression appeared first on Cybertec - The PostgreSQL Database Company.

COPY into distributed PostgreSQL tables, up to ~7M rows/sec

In the recent 5.1 release of the Citus extension for PostgreSQL, we added the ability to use the COPY command to load data into distributed tables. PostgreSQL's COPY command is one of the most powerful bulk loading features of any database and it is even more powerful on distributed tables. 

To get a sense of the achievable ingestion rate for COPY on distributed tables, I set up a Citus cluster on EC2 with a c4.8xlarge master node and 4 i2.4xlarge workers. (If you want to get started more quickly, you can use Citus Cloud to immediately provision a fully managed Citus cluster on top of AWS.) I used the US names dataset containing ~5.7 million rows and created a distributed table as follows:

CREATE TABLE names (
  state text,
  gender text,
  year int,
  name text,
  number int
);
SELECT master_create_distributed_table('names','name','hash');
SELECT master_create_worker_shards('names',16,1);

To get the best ingestion rate, I split the input into 16 chunks and loaded them in parallel using the following commands:

split -n l/16 names.csv chunks/
find chunks/ -type f | time xargs -n1 -P16 sh -c "psql -c \"\\COPY names FROM '\$0' WITH (FORMAT CSV)\""

I repeated the data loading command multiple times for different set-ups (local/distributed, with/without replication, and with/without an index on the year column) and put the results in the table below.

Table type    Index  Replication  Ingestion rate
Local         No     N/A          1.5M rows/sec
Local         Yes    N/A          0.5M rows/sec
Distributed   No     No           7M rows/sec
Distributed   No     Yes          5M rows/sec
Distributed   Yes    No           2.5M rows/sec
Distributed   Yes    Yes          1.2M rows/sec


In the fastest set-up, Citus loaded up to 7 million rows/sec (finishing in under a second). In the most realistic set-up, with replication and an index on the distributed table, COPY still achieves over 1M rows/sec for the names dataset. We recently saw a sustained 1.5M rows/sec in a production cluster, loading 10 billion rows in a few hours.

Ingestion rate depends

Posted by Simon Riggs in 2ndQuadrant on 2016-06-15 at 17:16


How do you tell what version of PostgreSQL you are using?
What is the version number of your PostgreSQL server?
What’s the release of your Postgres server?
What release is your Postgres database?
How do I write a program that acts differently for different versions of the PostgreSQL server?


For a long time it’s been possible to write a query to find out what version your server is.

SELECT version();

or you could look at that information via the preset parameter

SHOW server_version;

Both of those mechanisms refer to an internal string, accessible to C programs as PG_VERSION

Since PostgreSQL 8.2 we’ve been able to access the version number in a numeric form, allowing us to write programs that depend upon version number for any server version from 8.2 upwards to current versions and later.

SHOW server_version_num;

You can use that more easily within an SQL query like this

SELECT current_setting('server_version_num');

postgres=# SELECT current_setting('server_version_num');
(1 row)

Or you can access this value more easily using the PostgreSQL C API, PQserverVersion()

int PQserverVersion(const PGconn *conn);

These mechanisms access the internal macro PG_VERSION_NUM

These numeric values are much easier to use programmatically, allowing you to write programs that vary their behaviour across releases.

e.g. if (version < 90500)

This is mainly useful for accessing in-core functions or in-core Catalog Views.

These versions will be maintained when we go to release 10.0, so we expect this…

postgres=# SELECT current_setting('server_version_num');
(1 row)

so that 10.0 > 9.6 numerically.
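The numeric form simply encodes the version components; a small sketch of the scheme (my own illustration, not from the post):

```python
def server_version_num(major, minor, patch=0):
    """Mimic PG_VERSION_NUM: 9.6.3 -> 90603.

    From PostgreSQL 10 on, releases use a two-part major.minor scheme,
    so 10.1 -> 100001 (pass the minor as the second argument).
    """
    if major >= 10:
        return major * 10000 + minor
    return major * 10000 + minor * 100 + patch

print(server_version_num(9, 6, 3))   # → 90603
print(server_version_num(10, 1))     # → 100001
```

This is what makes simple numeric comparisons like `version < 90500` work correctly across the 9.x/10 boundary.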

A good example of such usage is marking your SQL functions as safe to use with the new Parallel Query feature in 9.6 and above.

DO $$
 DECLARE ver integer;
 BEGIN
  SELECT current_setting('server_version_num') INTO ver;
  IF (ver >= 90600) THEN
   EXECUTE 'alter function f7() 
Posted by Bruce Momjian in EnterpriseDB on 2016-06-15 at 15:30

Prepared statements are queries where the constants used in the query are separated from the query text. While this improves security by avoiding SQL injection attacks, it also allows repeatedly-executed queries to avoid parsing and planning overhead by executing saved generic plans that represent queries with typical constants. While generic plans don't have the advantage of being planned based on the statistics of specific constants, the avoidance of parsing and planning overhead is often more beneficial.

Before Postgres 9.2, generic plans were used for all prepared queries. In Postgres 9.2, logic was added to use a generic plan only if it has a cheaper cost after five or more executions. Unfortunately, this behavior was undocumented, causing confusion for users surprised to find the EXPLAIN plans of prepared queries changing after five executions, sometimes for the worse.
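As an illustration of the behavior described above (the users table and query are hypothetical):

```sql
-- Parse and plan structure once; $1 is supplied at execution time.
PREPARE get_user (int) AS
    SELECT * FROM users WHERE id = $1;

EXECUTE get_user(42);
-- From 9.2 on, after five executions the planner may switch to a cached
-- generic plan if it is not estimated to be costlier than custom plans.
```

Running EXPLAIN EXECUTE before and after the fifth execution is an easy way to observe the switch.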

After much discussion, this has been remedied by improving the Notes section of the 9.6 PREPARE manual page. It now explains how generic plans are created, and when they are used. I have also created an SQL script that, when run through psql (output), illustrates the documented behavior.


Dealing with partitioned tables we can’t always select relevant partitions during query planning. Naturally, during query planning you can’t know values which come from subquery or outer part of nested loop join. Nevertheless, it would be ridiculous to scan all the partitions in such cases.

This is why my Postgres Professional colleague Dmitry Ivanov developed a new custom executor node for pg_pathman: RuntimeAppend. This node behaves like the regular Append node: it contains a set of child nodes which should be appended. However, RuntimeAppend has one distinction: on each run it selects only the relevant children to append, based on parameter values.

Let’s consider an example: a join of the journal table, which contains a row for each 30 seconds of a year and is partitioned by day, and the q table, which refers to 1000 random rows of the journal table. Without RuntimeAppend, the optimizer selects a Hash Join plan.

Regular Append: Hash Join

# EXPLAIN ANALYZE SELECT * FROM q JOIN journal j ON q.dt = j.dt;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=27.50..25442.51 rows=1000 width=56) (actual time=0.479..252.506 rows=1000 loops=1)
   Hash Cond: (j.dt = q.dt)
   ->  Append  (cost=0.00..21463.01 rows=1051201 width=49) (actual time=0.005..152.258 rows=1051201 loops=1)
         ->  Seq Scan on journal_1 j  (cost=0.00..58.80 rows=2880 width=49) (actual time=0.004..0.247 rows=2880 loops=1)
         ->  Seq Scan on journal_2 j_1  (cost=0.00..58.80 rows=2880 width=49) (actual time=0.001..0.208 rows=2880 loops=1)
         ->  Seq Scan on journal_3 j_2  (cost=0.00..58.80 rows=2880 width=49) (actual time=0.001..0.197 rows=2880 loops=1)
         ...............................................................................................................
         ->  Seq Scan on journal_366 j_365  (cost=0.00..1.01 rows=1 width=49) (actual time=0.001..0.001 rows=1 loops=1)
   ->  Hash  (cost=15.00..15.00 rows=1000 width=8) (actual time=0.185..0.185 rows=1000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 48kB

Some time ago someone on irc asked about creating fast counters for something (banners I think). I talked with her (him?) about it, but figured I could just as well write a blog post so others can use it too. First, the general idea: we want to store incrementable counters, with the optional trade-off that “we can […]

(A Unicode rabbit face 🐰 will never be as cute
as this real bunny. Photo by Wade Simmons)

One of our clients recently reached out to us for help in upgrading their Postgres database. The use of the pg_upgrade program was not an option, primarily because the client was also taking the opportunity to change from their SQL_ASCII encoding to UTF-8. (If any of your databases, gentle reader, are still SQL_ASCII, please do the same!). Naturally, we also took advantage of the lack of pg_upgrade to enable the use of data checksums, another action we highly recommend. Although there were plenty of wrinkles, and stories to be told about this migration/upgrade, I wanted to focus on one particular problem we had: how to detect if a table has changed.

We needed to know if any applications were modifying certain tables because the speed of the migration was very important. If we could assert that no changes were made, there were some shortcuts available that would greatly speed things up. Initial testing showed that the migration was taking over eight hours, a time unacceptable to the client (no worries, we eventually reduced the time to under an hour!).

Looking closer, we found that over half that time was spent converting a single small (50MB) table from SQL_ASCII to UTF-8. How this conversion was performed is a story for another day, but suffice to say the table had some really, really messy bytes inside of it; the conversion program had to struggle mightily. When you are converting a database to a new encoding, it is imperative to examine every byte and make sure it gets changed to a format that Postgres will accept as valid UTF-8, or the entire table import will fail with an error similar to this:

ERROR:  invalid byte sequence for encoding "UTF8": 0xf4 0xa5 0xa3 0xa5
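That failure is reproducible outside Postgres: 0xf4 opens a four-byte UTF-8 sequence, and 0xa5 is not a legal continuation byte in that position. A small Python helper along these lines (an illustration, not the conversion program used in the migration) can locate bad byte runs before the import:

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, reason) for the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        return (exc.start, exc.reason)

# The exact bytes from the error message above:
print(find_invalid_utf8(b"\xf4\xa5\xa3\xa5"))   # reports an error at offset 0
print(find_invalid_utf8("ok".encode()))         # → None
```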

Looking closer at the data in the table showed that it might - just might! - be a historical table. In other words, it no longer receives updates, just selects. We really wanted this to be true, for it meant we could dump the whole table, convert it, and s

Posted by Kaarel Moppel in Cybertec on 2016-06-13 at 08:55

Recently there have been a couple of very exhaustive and rather technical posts on Planet PostgreSQL on the topic of bloat. Bloat, in short, for those new to Postgres, is an inescapable byproduct of Postgres' excellent concurrency model, MVCC (multi-version concurrency control), which creates new copies of rows when you update data in a concurrent […]

The post Estimating table bloat in PostgreSQL appeared first on Cybertec - The PostgreSQL Database Company.

While rewriting the reloptions.c code, it occurred to me that all the string reloptions used in the code are actually enum reloptions: just a fixed list of string constants, nothing more.

One string option is gist buffering, with the following validate function:
static void
gistValidateBufferingOption(char *value)
{
    if (value == NULL ||
        (strcmp(value, "on") != 0 &&
         strcmp(value, "off") != 0 &&
         strcmp(value, "auto") != 0))
    {
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid value for \"buffering\" option"),
                 errdetail("Valid values are \"on\", \"off\", and \"auto\".")));
    }
}

Another one is check_option for views. That is also a fixed list of two values:

static void
validateWithCheckOption(char *value)
{
    if (value == NULL ||
        (pg_strcasecmp(value, "local") != 0 &&
         pg_strcasecmp(value, "cascaded") != 0))
    {
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid value for \"check_option\" option"),
                 errdetail("Valid values are \"local\" and \"cascaded\".")));
    }
}

So the first idea that came to me was to add an enum type and switch the code to use it instead of strings.
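In miniature, and in Python rather than the C of reloptions.c, the change amounts to this: the legal values become a type, and validation becomes a membership lookup instead of a chain of strcmp() calls (the names here are illustrative, not the patch's):

```python
from enum import Enum

class GistBuffering(Enum):
    ON = "on"
    OFF = "off"
    AUTO = "auto"

def parse_buffering(value: str) -> GistBuffering:
    """Validate via enum membership instead of repeated string comparisons."""
    try:
        return GistBuffering(value)
    except ValueError:
        valid = ", ".join(m.value for m in GistBuffering)
        raise ValueError(f'invalid value for "buffering" option; '
                         f'valid values are: {valid}') from None

print(parse_buffering("auto"))   # → GistBuffering.AUTO
```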

The second idea was to get rid of the string type altogether. Once parsing of reloptions is done, all the data is stored as a bytea binary chunk that is copied from one memory context to another. If there is no string data in it, you can simply treat it as a C structure. But if there is string data, you must somehow pack it into that chunk, because otherwise it would not be copied between memory contexts. This all requires hacks that are not beautiful at all and, I think, should be avoided if possible. For example, this function takes the base size of the reloptions structure and adds extra space for each string option. I do not like this code:

void *
allocateReloptStruct(Size base, relopt_value *options,
Posted by Shaun M. Thomas on 2016-06-10 at 19:01

There comes a day in every young database’s life that it’s time to move on. I’m sorry 9.4, but the day has come that we must say goodbye. It’s not like we haven’t had our good times. While I truly appreciate everything you’ve done for me, we must part ways. I’m far too needy, and I can’t demand so much of you in good conscience. May your future patches make you and your other suitors happy!

In all seriousness, Postgres 9.5 has been out since January 7th. Despite that, I’ve never really been an advocate of deploying dot-0 versions in production environments. Since the Postgres dev team is notoriously, fanatically responsible, this is probably excessively paranoid. Still, I persist in believing that some day, my certifiable anxiety regarding fresh releases will pay off. Until then, it’s time to examine a few reasons to start upgrading the 120+ Postgres instances littering our server fabric. Especially now that 9.5.3 is out, making it practically rock solid in terms of field testing.


A common complaint about JSON and JSONB up to 9.4 was that there was no easy way to modify existing JSON objects. Adding fields was an exercise in text conversion, unpacking, concatenation, and re-casting. It was a mess. As a consequence, I recommended using it as a mere container column, or relying on PL/V8 or PL/Python to actually manipulate JSON data.

That’s no longer the case with 9.5. Not only are several standard operators overloaded to support JSONB, but they introduced the jsonb_set function for more direct manipulation. I covered this more extensively in the past, but here are a couple quick examples for reference:

-- Concatenation
SELECT '{"Hairy": true}'::JSONB || '{"Wobbly": false}'::JSONB;
 {"Hairy": true, "Wobbly": false}
-- Subtraction 
SELECT '{"Hairy": true, "Excited": false}'::JSONB - 'Excited';
 {"Hairy": true}
-- Explicit jsonb_set
SELECT jsonb_set(
  '{"Hairy": true, "Status": {"Burning": tru
Posted by Joe Abbate on 2016-06-09 at 18:51

In the early days of the Pyrseas project, I read about some open source projects being abandoned due to their developers losing interest or perhaps becoming involved elsewhere and thought to myself “That could never happen to me.”

Sadly, this blog has remained silent for over two years, and I haven’t done substantive development work since last September.

Still, some remain interested in Pyrseas, if gauged by the latest two issues: inquiring about consolidating/updating our documentation and porting dbtoyaml/yamltodb to other databases. So it’s appropriate that we discuss in what directions the project may turn.

Some background first.

Pyrseas: Origins

Pyrseas was born from my experience with making structural changes to SQL databases, version control systems, the typical incompatibility between the two and the solution suggested by Andromeda.

Andromeda had a fairly simple concept: describe your database using YAML and use that to drive the process of generating SQL to make structural modifications (also, use the YAML spec to produce a basic CRUD app to make content changes—see Part 3).

The Pyrseas innovation was: rather than manually edit a YAML spec, why not create it from the Postgres catalogs? In addition, instead of using the information_schema catalog views so that the process could be nominally portable to a few other SQL databases, we opted to query the internal catalogs directly.

The Imaginarium of Peter Eisentraut

In response to a post regarding another tool that diff’d database versions, Core Team member Peter Eisentraut commented that something that “doesn’t support all PostgreSQL features” is not helpful “to help you manage your database upgrades, because it might miss important details.”

That made us reconsider the scope of Pyrseas which initially was to be limited to tables, views and perhaps functions. We decided to address the vast array of Postgres features and some releases later we managed to achieve that goal, for the most part.

A post about the proper place of business logic then led to a

With one very small exception, there is nothing in the buildfarm server that requires the client to be running perl, even though both the server and the standard client are written in perl.

Andreas Scherbaum has written a new specialized client in python, and he asked me about removing that dependency. Essentially what this requires is that an alternative mechanism be allowed for transmitting the serialized configuration in the client's web transaction. Currently the client uses perl's Storable module to serialize the data, and the server uses the same module to de-serialize it. The obvious alternative candidate is to serialize it as JSON.

The first problem was to determine if we can sanely distinguish between data serialized by the two mechanisms. And we can. JSON is not allowed to contain any control characters, and a structure serialized using Storable's nfreeze() method is pretty much guaranteed to contain such characters. So I've added a test to the receiving program that looks for such characters and if it doesn't find them assumes that the data is JSON and decodes it accordingly. This has been tested using the client nightjar.
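The trick is easy to demonstrate: RFC 8259 forbids unescaped control characters inside JSON strings, while nfreeze() output is binary and nearly always contains control bytes. Here is a sketch of the receiving side's test, in Python rather than the server's Perl (whitespace control bytes, which are legal between JSON tokens, are excluded from the check):

```python
import json

ALLOWED_WS = {0x09, 0x0A, 0x0D}   # tab, LF, CR: legal whitespace in JSON text

def looks_like_json(blob: bytes) -> bool:
    """Heuristic from the post: no raw control bytes means the payload is JSON."""
    return not any(b < 0x20 and b not in ALLOWED_WS for b in blob)

json_payload = json.dumps({"branch": "HEAD", "status": 0}).encode()
storable_like = b"\x05\x0b\x04\x08binary nfreeze-style blob\x00"

print(looks_like_json(json_payload))   # → True
print(looks_like_json(storable_like))  # → False
```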

There are a few wrinkles, though. The main one is that we can't assume that every client has a module available to encode the data as JSON. In fact, nightjar didn't until I installed one. So in order to build this into the client I'll need to find a way to determine at runtime if it's available and fall back to using Storable otherwise.

We should also look at actually storing the data as JSON rather than as a serialized blob. When the buildfarm was created we didn't have any database support for JSON, but now the sensible thing to do would be to store the data as jsonb, and make it searchable.

But that's a project for another day.
Posted by Keith Fiske in OmniTI on 2016-06-08 at 21:25

As a followup to my previous post on checking for bloat, I figured I’d share some methods for actually cleaning up bloat once you find it. I’ll also be providing some updates on the script I wrote due to issues I encountered and thanks to user feedback from people that have used it already.

First, as these examples will show, the most important thing you need to clean up bloat is extra disk space. This means it is critically important to monitor your disk space usage if bloat turns out to be an issue for you. And if your database is of any reasonably large size, and you regularly do updates & deletes, bloat will be an issue at some point. I'd say a good goal is to always stay below 75% disk usage, either by archiving and/or pruning old data that's no longer needed, or by simply adding more disk space or migrating to new hardware altogether. Having less than 25% free can put you in a precarious situation: you may have a whole lot of disk space you could free up, but not enough room to actually do any cleanup at all, or not without impacting performance in big ways (e.g. you have to drop & recreate a bloated index instead of rebuilding it concurrently, making previously fast queries extremely slow).
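That 75% rule of thumb is easy to automate. A hedged sketch (the threshold and the use of the local filesystem are illustrative, not part of the post's actual monitoring setup):

```python
import shutil

def usage_percent(path):
    """Percentage of the filesystem holding `path` that is currently in use."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total

def cleanup_headroom_ok(path, threshold=75.0):
    """True while usage stays below the point where bloat cleanup gets risky."""
    return usage_percent(path) < threshold

print(f"current usage: {usage_percent('.'):.1f}%")
```

Wiring something like this into an alerting system gives you warning while there is still enough free space to rebuild an index concurrently rather than dropping and recreating it.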


The above graph (y-axis terabytes) shows my recent adventures in bloat cleanup after using this new scan, and validates that what it reported is actually bloat. In both this graph and the one below, there were no data purges going on, and each of the significant line changes coincided exactly with a bloat cleanup session. You can see a huge drop in size back on May 26th: an initial tiny drop, followed by a fairly big increase, then the huge drop. This is me first fixing one small but very bloated index, followed by running a pg_repack to take care of both table bloat and a lot of index bloat. This is actually the group_members table I used as the example in my previous post. Over the next week or so I worked through roughly 80 bloated objects to recover about 270GB of disk space. Now, it may turn out that some of th

Posted by Daniel Pocock on 2016-06-08 at 17:11

GSoC students have officially been coding since 23 May (about 2.5 weeks) and are almost half-way to the mid-summer evaluation (20 - 27 June). Students who haven't completed some meaningful work before that deadline don't receive payment, and in such a large program there is no possibility of giving students extensions or letting them try to catch up later.

Every project and every student are different, some are still getting to know their environment while others have already done enough to pass the mid-summer evaluation.

I'd like to share a few tips to help students ensure they don't inadvertently fail the mid-summer evaluation.

Kill electronic distractions

As a developer of real-time communications projects, many people will find it ironic or hypocritical that this is at the top of my list.

Switch off the mobile phone or put it in silent mode so it doesn't even vibrate. Research has suggested that physically turning it off and putting it out of sight has significant benefits. Disabling the voicemail service can be an effective way of making sure no time is lost listening to a bunch of messages later. Some people may grumble at first, but if they respect you, they'll get into the habit of emailing you and waiting for you to respond when you are not working.

Get out a piece of paper and make a list of all the desktop notifications on your computer, whether they are from incoming emails, social media, automatic updates, security alerts or whatever else. Then figure out how to disable them all one-by-one.

Use email to schedule fixed times for meetings with mentors. Some teams/projects also have fixed daily or weekly times for IRC chat. For a development project like GSoC, it is not necessary or productive to be constantly on call for 3 straight months.

Commit every day

Habits are a powerful thing. Successful students have a habit of making at least one commit every day. The "C" in GSoC is for Code and commits are a good way to prove that coding is taking place.

GSoC is not a job, it is like a freelance project. There is