The world's most advanced open source database
Top posters
Number of posts in the past month
Top teams
Number of posts in the past month
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
  • Get in touch with the Planet PostgreSQL administrators at planet at

Postgres 9.5 is bringing a change in the way WAL is archived with the following commit:

commit: de7688442f5aaa03da60416a6aa3474738718803
author: Heikki Linnakangas <>
date: Fri, 8 May 2015 21:59:01 +0300
At promotion, archive last segment from old timeline with .partial suffix.

Previously, we would archive the possible-incomplete WAL segment with its
normal filename, but that causes trouble if the server owning that timeline
is still running, and tries to archive the same segment later. It's not nice
for the standby to trip up the master's archival like that. And it's pretty
confusing, anyway, to have an incomplete segment in the archive that's
indistinguishable from a normal, complete segment.


As mentioned in the commit log above, prior to 9.5, a standby would always try to archive at promotion the last, partial WAL segment of the old timeline it was recovering on. This is a behavior that has been present in Postgres for ages, and there were no easy way to make a difference between a segment completely full and one only partially completed.

The data of this last partial segment is available on the segment file of the new timeline for the standby, but its name does not match the one of the old timeline as it uses as prefix the new timeline standby has been promoted on, and it contains data of the new timeline as well. Actually having it is useful when recovering on the old timeline of the master.

Note as well that the pre-9.5 behavior can cause conflicts particularly in the case where a master and its standby(s) point to the same archive location as master would try to archive a complete segment once it is done with it, and standby would archive a partial one with exactly the same name. Advanced users are normally (hopefully) using archiving scripts more advanced than a single copy command, so they may have some internal handling regarding such conflicts enabling them to save both files and make a clear difference from which node the segment has been archived, still it is an

[continue reading]

Posted by Vilis Lacis on 2015-05-28 at 07:50:00

SQL Performance Explained: Everything Developers Need to Know about SQL Performance

An in-depth book on how to improve database performance. The focus is on relational databases and it covers all major SQL databases without getting lost in the details of any one specific product. Starting with the basics of indexing and the WHERE clause, SQL Performance Explained guides developers through all parts of an SQL statement and explains the pitfalls of object-relational mapping (ORM) tools like Hibernate.

PostgreSQL 9.0 High Performance

An excellent book for intermediate to advanced PostgreSQL database administrators (DBA). Teaches everything about building, monitoring and maintaining a PostgreSQL installation while also providing useful and interesting information about database internals. If you truly want to understand how PostgreSQL operates and behaves under a high load this is a book for you.

High Performance MySQL: Optimization, Backups, and Replication

Advanced techniques for everything from designing schemas, indexes, and queries to tuning your MySQL server, operating system, and hardware to their fullest potential. This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover.
The PostgreSQL Project released an update on May 22, and the only really important fix in that update was a patch to prevent data corruption due to "Multixact member wraparound".  However,  it turns out that there are some issues with that update release, including one with file permissions, and potentially another one with Multixact truncation.  As a result, users are feeling justifiably reluctant to apply our last update release.

If there's a lesson for the project in this, it's "don't wrap an update release the same week as Feature Freeze".

Anyway, given this it would be good for users to determine more definitively if they are in actual danger of Multixact member wraparound (MMW hereafter), so that they know if they need to apply the update right away despite issues.  Here's how to do that:
  1. navigate to the PostgreSQL data directory on your server(s), e.g. "/var/lib/postgresql/9.3/main"
  2. switch to the subdirectory "pg_multixact/members" 
  3. count the number of files in this directory, e.g. "ls -l | wc -l"
If you have a few dozen, hundred, or even a few thousand multixact member files, you're fine.  If you have over 10,000, and that number increases rapidly, then you are potentially in danger of MMW, and should apply the update sooner rather than later. 

Of course, there are other fixes in this latest update, and if one of them specifically affects you, you may have updated already.

Thank you to Thomas Munro for pointing out the simple way to determine this and giving me background material on the fixes.

Note: the above is my personal advice, and has not been approved by the PostgreSQL project, core team, or PostgreSQL Experts Inc. The PostgreSQL project generally advises applying all updates promptly.
This blog is about compiling pg_repack extension on binary format of PostgreSQL installation. Most of you know, pg_repack extension is one of well known PostgreSQL extensions, its especially used for reclaiming space[bloats] ONLINE without holding an EXCLUSIVE LOCK on Tables/Indexes.

To enable pg_repack extension in PostgreSQL database, it should be compiled from sources. Its quite easy and simple to compile from source on any installed variants(source,rpm,binary) of PostgreSQL, however its slightly different if it is with binary format of PostgreSQL [One Click Installer] as they are pre-built binary bundle with dependency libraries. Let's compile and see.

On CentOS 7 Virtual Machine, I have installed binary format of PostgreSQL 9.4(download link) and its home directory "/opt/PostgreSQL/9.4/". Next we need to download pg_repack source from their official site.
[root@localhost ~]# git clone
Before compiling, pg_config of PostgreSQL 9.4 should be set in the PATH.
[root@localhost pg_repack]# export PATH=/opt/PostgreSQL/9.4/bin:$PATH
[root@localhost pg_repack]# type pg_config
pg_config is /opt/PostgreSQL/9.4/bin/pg_config
Now we are good to execute source installation commands "make" & "make install". Let's execute "make"
[root@localhost ~]# cd pg_repack/
[root@localhost pg_repack]# make
make[1]: Entering directory `/root/pg_repack/bin'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 pg_repack.o pgut/pgut.o pgut/pgut-fe.o -L/opt/PostgreSQL/9.4/lib -lpq -L/opt/PostgreSQL/9.4/lib -L/opt/local/Current/lib -Wl,--as-needed -Wl,-rpath,'/opt/PostgreSQL/9.4/lib',--enable-new-dtags -lpgcommon -lpgport -lssl -lcrypto -lgssapi_krb5 -lz -ledit -lrt -lcrypt -ldl -lm -o pg_repack
/usr/bin/ld: cannot find -ledit
collect2: ld returned 1 exit status
make[1]: *** [pg_repack] Error 1
make[1]: Leaving directory `/root/pg_repack/bin'
make: *** [all] Error 2
Oops 1...!!, se

[continue reading]

PostgreSQL 9.4 extends the SQL standard by inserting two new clauses that facilitate many operations required during the development of applications: the WITHIN GROUP and FILTER clauses.



The WITHIN GROUP clause is particularly useful when performing aggregations on ordered subsets of data.

PostgreSQL introduced window functions since version 9.0 in order to work on subsets of data that can be correlated to each current record of tables, defining a sort of “aggregates” centred on any specific record as the query is gradually executed via the SQL OVER(PARTITION BY/ORDER BY) clause and by using the functions that can be performed on those aggregations.

With version 9.4 of PostgreSQL the SQL WITHIN GROUP clause was introduced: this simplified many operations that had previously only been possible with the use of the window functions, defining aggregations of ordered subsets of data.
In addition, new functions were introduced that can be applied to these subsets and expand the collection of available window

  • percentile_cont(), percentile_disc() for the calculation of
  • mode() a statistical function that calculates the mode on ordered
  • rank(), dense_rank(), percent_rank(), cume_dist(): window functions already available in PostgreSQL to be executed on the subsets obtained using the OVER (PARTITION BY/ORDER BY) clause and now able to take as a parameter ordered subsets produced by the WITHIN GROUP clause.

To better clarify the situation, let’s say, for example, that we want to calculate the 25th, the 50th, the 75th and the 100th percentile of the first 20 integers. Until now, this was only possible by dividing the numbers into 4 sets via the OVER (PARTITION BY/ORDER BY) clause, then order them internally into 4 ordered subsets from which to then take the maximum value – for example, by using a CTE:

$ CREATE TABLE t AS SELECT generate_series(1,20) AS val;

$ WITH subset AS (
    SELECT val,
       ntile(4) OVER (ORDER BY val) AS tile
    FROM t

[continue reading]

On 16th of May, Andres Freund committed patch: Support GROUPING SETS, CUBE and ROLLUP.   This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL.   This could previously be achieved by doing each grouping […]

Things are shaping up nicely in PostGIS 2.2 development. We are going to hit feature freeze around June 30th 2015, and plan to ship late August or early September to be in line with PostgreSQL 9.5 release. So far we have committed a couple of neat features most itemized in PostGIS 2.2 New Functions. Many of the really sort after ones will require PostgreSQL 9.5 and GEOS 3.5. The geography measurement enhancements will require Proj 4.9.0+ to take advantage of. Things I'd like to highlight and then later dedicate full-length articles in our BostonGIS Waiting for PostGIS 2.2 series once they've been stress tested.

Continue reading "PostGIS 2.2 leveraging power of PostgreSQL 9.5"
On 15th of May, Simon Riggs committed patch: TABLESAMPLE, SQL Standard and extensible   Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use […]

For the second of my ten part series on hidden PGDB (PostgreSQL) performance killers, I’m going to talk about something called an anti-join. It’s not a well-known approach outside of the database world, but due to how it works, it can impart almost magical plan revisions that drastically improve query performance in the right scenario. Developers can add it to a growing bag of tricks when working on database-driven content, since it comes in handy more often than you might expect.

Let’s build a test-case, shall we?

CREATE TABLE sys_product
    product_id   SERIAL  PRIMARY KEY,
    prod_name    TEXT    NOT NULL,
    quality      INT     NOT NULL,
    descr        TEXT    NOT NULL DEFAULT now()
INSERT INTO sys_product (prod_name, quality, descr)
SELECT 'Product ' ||,
       log(( % 100) + 1)::INT,
       'It does stuff.'
  FROM generate_series(1, 100000) a(id);
CREATE TABLE sys_order
    order_id     SERIAL       NOT NULL,
    product_id   INT          NOT NULL,
    item_count   INT          NOT NULL,
    order_dt     TIMESTAMPTZ  NOT NULL DEFAULT now(),
    valid_dt     TIMESTAMPTZ  NULL
INSERT INTO sys_order (product_id, item_count, order_dt, valid_dt)
SELECT ( % 100000) + 1, ( % 100) + 1,
       now() - (id % 1000 || 'd')::INTERVAL,
       CASE WHEN % 499 = 0
            THEN NULL
            ELSE now() - (id % 999 || 'd')::INTERVAL
  FROM generate_series(1, 1000000) a(id);
ALTER TABLE sys_order ADD CONSTRAINT pk_order_order_id
      PRIMARY KEY (order_id);
CREATE INDEX idx_order_product_id
    ON sys_order (product_id);
CREATE INDEX idx_order_valid_dt
    ON sys_order (valid_dt);
ANALYZE sys_product;
ANALYZE sys_order;

This is a very basic product and order table structure, and we’ve used it before in the last installment of this series. The only columns we’ve added since last time is the quality column in sys_product and the valid_dt column in sys_order. This way, we can introduce some variability into the query plan due to data correlatio

[continue reading]

Posted by Marco Slot in CitusData on 2015-05-22 at 12:03:14

The pg_shard extension helps you scale out PostgreSQL for large data sets by transparently distributing tables across many nodes, storing (replicated) shards of the data as regular PostgreSQL tables on worker nodes. With pg_shard, you can perform INSERT, UPDATE, DELETE, SELECT on a distributed table as if you were using a regular table.

While first-class COPY support is still on the horizon, we've already added a copy_to_distributed_table tool for loading data files, which supports the same formats as COPY. When using this tool, pg_shard creates a temporary 'insert proxy' table which has a trigger that INSERTs into the distributed table. The copy_to_distributed_table script copies a file into the insert proxy.

Below we give an example of loading customer reviews data using the example from the pg_shard github page and the copy_to_distributed_table script that comes with pg_shard.

copy_to_distributed_table -C customer_reviews_1998.csv customer_reviews

Some users have noticed that pg_shard performs poorly when performing many INSERTs over a single connection and the same applies to copy_to_distributed_table. The reason is simple: pg_shard has to wait for a network round-trip on every INSERT. Since there is no notion of parallelism within a single PostgreSQL connection, consecutive INSERTs on the same connection will have to wait. However, pg_shard can still handle more INSERTs on other connections while waiting.

When loading a large data file, it's therefore recommended to split the file into multiple chunks first and use xargs -P 64 to load the chunks concurrently. The commands below have the same effect as the command above, but run much faster.

mkdir chunks
split -n l/64 customer_reviews_1998.csv chunks/
find chunks/ -type f | xargs -n 1 -P 64 sh -c 'echo $0 `copy_to_distributed_table -C $0 customer_reviews`'

On a c3.8xlarge EC2 instance, using parallel data loading for pg_shard is roughly 40x faster and inserts around 50,000 rows per second in the above example, but there are many ways of scaling out pg_sh

[continue reading]

On 14th of May, Stephen Frost committed patch: Add pg_audit, an auditing extension   This extension provides detailed logging classes, ability to control logging at a per-object level, and includes fully-qualified object names for logged statements (DML and DDL) in independent fields of the log output.   Authors: Ian Barwick, Abhijit Menon-Sen, David Steele Reviews […]
Posted by Feng Tian on 2015-05-21 at 23:33:00
Say you have carefully tuned your database and wow! it is ten times faster -- but, before you push it to the production cluster, how do you know the answer of the query is correct?   Or at least, same as before?   :-)    You may have same question if you want to upgrade or migrate your database, or, for us, we want to make sure the new great join algorithm actually produce correct answer.   Well, let's just put the result before, and after, in two tables, and compare the two tables.

This is great, but turns out to be quite tricky.   You need to take care of the cases that,
  •  Either table may not have a primary key
  •  Rows in the two tables may be physically stored in different orders.
  • Columns could contain nulls. 
  • The two tables could be huge, so performance must be good.
Here are something you could do.   

Method 1: You can dump the tables to text or csv files, the run diff -- but you need to take care of ordering, so you have to copy the table out with order by clause.    And good luck diff-ing 2TB of text file.

Method 2: Use except.  How about this query?  
select * from (
    select * from ta except select * from tb 
    union all 
    select * from tb except select * from ta
) foo; 

It is totally wrong!   See attached sql script for some surprise.   And this?

select * from (
    select * from (select * from ta except select * from tb) xxx
    union all
    select * from (select * from tb except select * from ta) yyy
) foo;
Better, but it did not take care of duplicate rows -- and, checking select count(*) is not good enough.

Method 3: Join by ourselves!  Here is a script from my friend CK.

A as ( 
    select hashtext(textin(record_out(ta))) as h, count(*) as c 
    from ta group by h
B as (
    select hashtext(textin(record_out(tb))) as h, count(*) as c 
    from tb group by h
select * from A full outer join B on (A.h + A.c= B.h + B.c)
where A.h is null or B.h is null limit 5;

The hashtext part is optional -- for very wide rows, using hashtext will save some cycles and memory.   This is

[continue reading]

Posted by Joshua Drake in CommandPrompt on 2015-05-21 at 17:48:32
I spent a great deal of time working through the SQL DUMP portion of the 9.5devel docs this past week. Below is the current text of what I have and it would be great if my readers would take a look and offer some thoughtful feedback. What would you like to see added? What would you like to see changed? Please note that this is reference documentation not tutorial documentation.

This is just the straight HTML dump that is generated from Docbook but since it is inline the links won't work. <!a href="">The current -devel docs are here and the updated version I am working is below:

<!a name="BACKUP-DUMP" id="BACKUP-DUMP">24.1. SQL Dump

PostgreSQL provides the program <!a href="app-pgdump.html">pg_dump for generating a backup file with SQL commands that, when fed back to the server, will recreate the database in the same state as it was at the time of the dump. The basic usage of <!a href="app-pgdump.html">pg_dump is:

pg_dump -C -F p -f outfile dbname

The use of -C ensures that the dump file will contain the requisite <!a href="sql-createdatabase.html">CREATE DATABASE command within the dump file. The use of -Fp ensures that you are using the plain text format and the use of -f allows you to specify the name of the file the dump will be written to. It is also possible for pg_dump to create files in other formats that allow for parallelism and fine-grained control of object backup or restoration. For more details on all options available to pg_dump please refer to the <!a href="app-pgdump.html">pg_dump reference page.

The pg_dump application requires read access to all objects within the database that it will be operating with. This generally requires database super-user access. It is possible for any database user to use pg_dump to backup the objects that they own regardless of super-user access. This can be achieved using options such as -n schema or -t table.

The primary advantage of using pg_dump over the other backup methods described is that pg_dump

[continue reading]

In this entry, I would like to explain how to use pt-table-usage and pt-index-usage commands in the Postgres Toolkit which I introduced in the previous entry. If you have never heard about Postgres Toolkit, please visit the previous entry. A Hacker's Diary: Postgres Toolkit 0.2 released pt-table-usage and pt-index-usage are the commands which can be used to obtain usages of the tables/indexes
On 15th of May, Peter Eisentraut committed patch: Add pg_settings.pending_restart column   with input from David G. Johnston, Robert Haas, Michael Paquier This is, for me, great. It might not sound like much, but it is actually something that I'm very excited about. Basically – let's assume you change system settings. Either by modifying postgresql.conf, […]
Posted by Andrew Dunstan in pgExperts on 2015-05-20 at 16:47:00
There has been some debate going on on the hackers list about problems certain drivers have with operators that contain the ? character. Since drivers such as the JDBC driver consider any such character outside of quote marks to be a parameter placeholder, its use as part of an operator makes life difficult for them. Part of the problem has been that we allow almost every available non-alphanumeric ASCII character to be a part of an operator. Why, for example, do we allow `? I've never seen it that I can recall, but the fact means we can't use it for something else (such as marking an operator). That makes doing anything about the problem hard. Some people want to deprecate the use of ? in operators. I think it's far too late for that - its use is far too widespread.
On 12th of May, Andrew Dunstan committed patch: Additional functions and operators for jsonb   jsonb_pretty(jsonb) produces nicely indented json output. jsonb || jsonb concatenates two jsonb values. jsonb - text removes a key and its associated value from the json jsonb - int removes the designated array element jsonb - text[] removes a key […]

A couple of days back the following commit has landed in the Postgres world, for the upcoming 9.5 release:

commit: a486e35706eaea17e27e5fa0a2de6bc98546de1e
author: Peter Eisentraut <>
date: Thu, 14 May 2015 20:08:51 -0400
Add pg_settings.pending_restart column

with input from David G. Johnston, Robert Haas, Michael Paquier

Particularly useful for system doing a lot of server parameter updates, this allows tracking parameters in need of a server restart when their value is updated to have the new value take effect on the system. Note that this applies to all the parameters marked as PGC_POSTMASTER in guc.c, shared_buffers being one, as well as the custom parameters a system may have after their load by a plugin. This information is tracked by a new column called pending_restart in the system view pg_settings with a boolean value set to "true" if a given GUC parameter is indeed waiting for a server restart.

In order to make visible the fact that parameter waits for a restart, the server can have its parameters be reloaded with either pg_reload_conf(), "pg_ctl reload" or a SIGHUP signal. Of course, modifications made in postgresql.conf, as well as any configuration files included, or ALTER SYSTEM are taken into account. See for example:

=# \! echo "port = 6666" > $PGDATA/postgresql.conf
=# ALTER SYSTEM SET shared_buffers TO '1GB';
=# SELECT pg_reload_conf();
(1 row)
=# SELECT name FROM pg_settings WHERE pending_restart;
(2 rows)

This will prove to be useful for many systems around, like those doing automatic tuning of system parameters or even containers (not limited to it of course).

Posted by Giulio Calacoci in 2ndQuadrant on 2015-05-19 at 08:30:45

The release of PostgreSQL 9.5 is imminent so the time has come to analyse what’s new in this latest version.

A very interesting feature of version 9.5 is the ability to import a schema from a remote database, using Foreign Data Wrapper and the IMPORT FOREIGN SCHEMA command.


Foreign Data Wrappers (FDW)

Before the introduction of Foreign Data Wrappers, the only way to connect a Postgres database with an external data source was using the dblink module.

In 2003 the set of rules for the standard management of external data sources was defined within the SQL language: SQL/MED  (management of external Data).

PostgreSQL 9.1 introduced a first implementation of the standard SQL/MED with the Foreign Data Wrappers, which provided Postgres with direct access to data sources such as files or other databases (Oracle, Mysql…), allowing their use as tables.

The advantage of this approach is obvious: it gives you the ability to connect to an external data source to extract data natively by running a simple query. The fact that it is not necessary to use external modules to obtain this result considerably simplifies the DBAs’ work.

If you want to know more, take a look at our 2011 blog post: PostgreSQL 9.1: Tabelle esterne con SQL/MED (Warning: blog post in Italian).

Small example of an FDW

PostgreSQL 9.3 introduces the support for Foreign Data Wrappers in writing (Warning: blog post in Italian) and also added support to the foreign data wrapper for PostgreSQL. Let’s have a look at a simple example of how to use an FDW by connecting together two Postgres databases.

First we create two databases:

CREATE DATABASE destination;

Within the source we create a test table with test data:

\c source
CREATE TABLE test1 AS SELECT id, md5(random()::text) FROM generate_series(1,5) id;

Now we connect to the destination db and then, connect the two databases:

\c destination
CREATE EXTENSION postgres_fdw ;
CREATE SERVER src_srv FOREIGN DATA WRAPPER postgres_fdw OPTIONS( dbname 'source' );

[continue reading]

Tom Lane introduced new state for TOAST types - expanded state. The side effect of this change together with some changes in plpgsql internals has big impact to speed of array operation in plpgsql.
I have two plpgsql block with cycles. First enforces a array fields update, second a array append.
DO $$ 
DECLARE a int[] = array_fill(10, ARRAY[10000]);
FOR i IN 1 .. 10000 LOOP
a[i] := 0;

DO $$
DECLARE a int[] = '{}';
FOR i IN 1 .. 10000 LOOP
a := a || 10;
You can try this code on PostgreSQL 9.4 - and you can get time 450 and 220ms. Same code needs 6 and 5 ms only on PostgreSQL 9.5! It is more than one order speedup.
Posted by gabrielle roth on 2015-05-19 at 02:07:00
I do a lot of “is this !#@(* really working correctly” testing. (See also: simple test for autovacuum log messages.) Here’s a simple test to verify that log_lock_waits is configured correctly. This is essential for troubleshooting lock contention. -- verify settings -- log_lock_waits should be 'on'; it is off by default. SHOW log_lock_waits; -- lock_wait […]
Posted by Greg Sabino Mullane in End Point on 2015-05-18 at 12:00:00

Determining if your current database connection is using PgBouncer, or going directly to Postgres itself, can be challenging, as PgBouncer is a very low-level, transparent interface. It is possible, and here are some detection methods you can use.

This was inspired by someone asking on the Perl DBD IRC channel if it was possible to easily tell if your current database handle (usually "$dbh") is connected to PgBouncer or not. Since I've seen this question asked in other venues, I decided to take a crack at it.

There are actually two questions to be answered: (1) are we connected to PgBouncer, and if so, (2) what pool_mode is being run? The quickest and easiest wat I found to answer the first question is to try and connect to a non-existent database. Normally, this is a FATAL message, as seen here:

$ psql testdb -p 5432
testdb=# \c ghostdb
FATAL:  database "ghostdb" does not exist
Previous connection kept

However, a slightly different ERROR message is returned if the same thing is attempted while connected to PgBouncer:

$ psql testdb -p 6432
testdb=# \c ghostdb
ERROR:  No such database: ghostdb
Previous connection kept

Thus, an ERROR will always indicate that you are connected to PgBouncer and not directly to Postgres, which will always issue a FATAL.

In the future, there will be an even simpler method. As of this writing, pgBouncer 1.6 has not been released, but it will have the ability to customize the application_name. This is a configurable session-level variable that is fairly new in Postgres. Andrew Dunstan wrote a patch which enables adding this to your pgbouncer.ini file:

application_name_add_host = 1

This will make PgBouncer modify the application_name to append some information to it such as the remote host, the remote port, and the local port. This is a feature many PgBouncer users will appreciate, as it offers an escape from the black hole of connection information that PgBouncer suffers from. Here is what it looks like on both a normal Postgres conne

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2015-05-17 at 17:20:00
One of the things I like about data recovery jobs is that each one is different. Right now I'm wrestling with this problem: how can I recover whatever is possible from a backup that's missing a handful of xlog files. We know this will result in a probably corrupt database, but we still want to see what data is available.

My first two approaches didn't work too well. I tried mangling the control file, and creating dummy xlog files to replace the handful that are missing.

Right now I'm trying to follow a suggestion from Andres Freund to drive the process via the debugger, moving it on to the next available log file when it gets to where it would want the first missing log file.  The trouble is it's hard to find what to set exactly and where to set it (this is Postgres 9.3). Every time I try something I get errors like "unexpected page_addr". If I were more familiar with this section of the code it might be easier, but it's something I've pretty much never dabbled in.

It all makes for an exciting life.
Posted by Robins Tharakan on 2015-05-16 at 14:50:00
While reviewing an existing application, I thought it'd be worthwhile to review how good / bad PostgreSQL is in terms of OLAP. This (growing) post is going to be my (un)learning of how ready is PostgreSQL. Row Numbering Support: Yes.  Use: Row_Number() function numbers rows generated in a result-set. Example: SELECT   row_number() OVER (ORDER BY marks DESC) AS rn,  nameFROM x; Review: Some
Finally ! A *much* awaited feature, this attempt at adding the GROUPING SETS / ROLLUP / CUBE feature to PostgreSQL has been in the works for about a year (besides the so many in the past decade and a half that didn't get through), and thankfully this has finally got the approval of the powers that be, so the upcoming Postgres 9.5 would finally have this long pending SQL feature. MSSQL and
Posted by Joel Jacobson on 2015-05-16 at 10:08:21

By coincidence I stumbled upon a table where one of its columns didn’t have a foreign key. I found this strange, since our convention is to always add foreign keys when appropriate.

But humans are humans, and will eventually forget to add a foreign key, so we better have both belt and suspenders and find a way to deal with this inevitable problem in an efficient way.

It would be a tedious job to manually look for missing foreign keys in all tables and columns.
But if you’re lucky enough to have the pleasure of working with a system where all tables and columns have been given their names by following a strict naming convention, it might be possible to fully automate the task.

This is the namning convention we use:

  1. Table names are always in plural form with a tailing “s”, e.g. users
  2. Primary key column names are always equal to the table name in singular form with a tailing id, e.g. userid.
  3. Foreign key columns are always equal to the primary key they are referecning, e.g. transactions.userid -> users.userid

This means you always know based on the column name alone, what table if any that column might be referencing.

Example: If table transactions have a column named userid, and there is a table where userid is also the primary key, but if there isn’t any foreign key on transactions.userid, then it’s a missing foreign key, or else someone has not followed the namning convention.

Thanks to PostgreSQL’s fantastic pg_catalog system tables, we can write a query which uses the rules from the naming convention and returns all the columns which appears to be missing foreign keys. It doesn’t support multi-column keys, but we don’t have many of those, so it’s not a problem in my case.

Thanks to the view below, I automatically found three more missing foreign keys of the same type, which saves me many hours of boring work today.

pg1:joel=#* SELECT * FROM view_missing_foreign_keys;
 nspname |      relname    | attname 
 public  | transactions    | userid
 public  | someohtertable1 | use

[continue reading]

Yesterday I gave the developers at my company what I call a DBA Chat. It’s something I try to do every month to keep them apprised on various features, caveats, performance considerations, and so on. I find that educating the folks who regularly work with the database does wonders for application performance and my sanity. The benefit of this long format is that I can go over more information than a time constrained set of slides.

This month we went over one of my slightly older talks, and even though the material was three years old from my perspective, it was all new to them. And why not? Developers have a job to do, and while they do work with the database, it generally isn’t their responsibility to research the voluminous minutia and quirks commonly associated with a specific platform. That’s my job.

So here is the first of a ten part series on anti-patterns that can annihilate PGDB (PostgreSQL) performance, some of which are outright insidious. This week we’ll start slow with a short discussion on the importance of EXPLAIN. This is PGDB’s first line of defense against bad queries, and it’s overlooked more than you might expect, as well as somewhat difficult to interpret for the uninformed. I’ll probably have a more in-depth version of this article in the future for truly interpreting EXPLAIN output, but this time we’ll be focusing on using it in general.

First, we need to set up a use case to illustrate a couple of variants. Two tables with a close relationship should do it:

CREATE TABLE sys_product
    product_id   SERIAL  PRIMARY KEY,
    prod_name    TEXT    NOT NULL,
    descr        TEXT    NOT NULL DEFAULT now()
INSERT INTO sys_product (prod_name, descr)
SELECT 'Product ' ||, 'It does stuff.'
  FROM generate_series(1, 1000) a(id);
CREATE TABLE sys_order
    order_id     SERIAL       NOT NULL,
    product_id   INT          NOT NULL,
    item_count   INT          NOT NULL,
    order_dt     TIMESTAMPTZ  NOT NULL DEFAULT now()
INSERT INTO sys_order (product_id, item_count)

[continue reading]

Posted by Josh Berkus on 2015-05-15 at 19:13:28

We're having two meetups in May: one in Oakland, and one in San Francisco:

Unfortunately, the Oakland one already has a waiting list, but get on it if you're interested in case of cancellations. And a big thank-you to and for sponsoring these meetups.

Posted by Andrew Dunstan in pgExperts on 2015-05-15 at 13:04:00
Every so often I've wanted a SET metatype in Postgres.

We do have bit fields, which can be used as sets with a little work, and this is how MySQL does it's SET types as I understand it. But what if you want a set of enums (our enums can have very large lists of values), or a set of text values, or of integers? Bit fields won't work for these - we would need some sort of sparse representation.

One obvious way would be to store the set as a one-dimensional array, kept in sorted order for efficient searching. For text values, we could probably use some smart form of sorting such as is used by jsonb and hstore to store object keys, to reduce the need for string comparisons.

Obviously, I have done no work on this, but it seems like it would be a worthwhile project.

In the meantime, you can use hstore more or less in this way. Just treat it as a mapping from whatever values you have to 1.  It's a bit of a hack, but it should work. You'll just need to cast your values to text.
Posted by Andrew Dunstan in pgExperts on 2015-05-13 at 16:32:00
Yesterday I committed some new features for jsonb. These are:
  • jsonb_pretty(jsonb) produces nicely indented json output.
  • jsonb || jsonb concatenates two jsonb values.
  • jsonb - text removes a key and its associated value, or a matching array element, from the json
  • jsonb - int removes the designated array element
  • jsonb - text[] removes a key and associated value or array element at the designated path
  • jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value.
This was based Dmitry Dolgov's jsonbx extension for 9.4, which in turn was based on earlier work by Oleg Bartunov, Teodor Sigaev and Alexander Korotkov on nested hstore. Petr Jelinek also did some useful review and tidy up work on the core features. So, a good team effort.