High availability of PostgreSQL databases is incredibly important to me. You might even say it’s a special interest of mine. It’s one reason I’m both excited and saddened by a feature introduced in 9.4. I’m excited because it’s a feature I plan to make extensive use of, and saddened because it has flown under the radar thus far. It’s not even listed on the What’s new in PostgreSQL 9.4 wiki page. If they’ll let me, I may have to rectify that.
What is this mysterious change that has me drooling all over my keyboard? The new recovery_min_apply_delay standby server setting. In name and intent, it forces a standby server to delay application of upstream changes. The implications, however, are much, much more important.
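For a concrete idea of how it’s used, here is a minimal sketch of a standby’s recovery.conf with the new setting (the connection string and the one-hour delay are examples only, not recommendations):

# recovery.conf on a 9.4 standby
standby_mode = 'on'
primary_conninfo = 'host=primary dbname=postgres'
# WAL is still received and stored immediately; only its
# application is held back, here by one hour:
recovery_min_apply_delay = '1h'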
Let me tell you a story; it’s not long, actually. A couple years ago, I had to help a client that was using a hilariously over-engineered stack to prevent data loss. I only say that because at first glance, the number of layers and duplicate servers would shock most people, and the expense would finish the job. This was one of my full recommended stacks, plus a few extra bits for the truly paranoid. DRBD-bonded servers, Pacemaker failover, off-site disaster recovery streaming clones, nightly backup, off-site backup and historical WAL storage, and long-term tape archival in a vault for up to seven years. You would need to firebomb several cities to get rid of this data.
But data permanence and availability are not synonymous. All it took was a single misbehaving CPU to take out the entire constellation of database servers, and corrupt a bunch of recent WAL files for good measure. How this is possible, and how difficult it is to avoid, is a natural extension of using live streaming replicas for availability purposes. We always need to consider one important factor: immediacy applies to everything.
Here’s what actually happened:
This week the following commit landed in the PostgreSQL code tree, introducing a new feature that will be released in 9.5:
commit: cb1ca4d800621dcae67ca6c799006de99fa4f0a5
author: Tom Lane <firstname.lastname@example.org>
date: Sun, 22 Mar 2015 13:53:11 -0400
Allow foreign tables to participate in inheritance.

Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.

[...]

Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me
As mentioned in the commit message, foreign tables can now be part of an inheritance tree, be it as a parent or as a child.
Well, seeing this commit, one phrase comes immediately to mind: in-core sharding. This feature opens such possibilities, with for example a parent table locally managing a partitioned set of foreign child tables located on a set of foreign servers.
PostgreSQL already offers a way to do partitioning by using CHECK constraints (a non-intuitive system, but there may be improvements in this area in the near future). Combined with the feature just committed, here is a small example of how to do sharding without the need for any external plugin or tool; only postgres_fdw is needed to define the foreign tables.
Now let's take the example of 3 Postgres servers, running on the same machine for simplicity, using ports 5432, 5433 and 5434. 5432 holds a parent table with two child tables, both of them foreign tables located on the servers listening on 5433 and 5434. The test case is simple: a log table partitioned by year.
First on the foreign servers, let's create the child tables. Here it is for the table on server 5433:
=# CREATE TABLE log_entry_y2014 (
       log_time timestamp,
       entry text,
       CHECK (date(log_time) >= '2014-01-01' AND date(log_time) < '2015-01-01')
   );
CREATE TABLE
And the second one on 5434:
=# CREATE TABLE log_entry_y2015 (
       log_time timestamp,
       entry text,
       CHECK (date(log_time) >= '2015-01-01' AND date(log_time) < '2016-01-01')
   );
CREATE TABLE
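On the parent server listening on 5432, the shards can then be tied together with postgres_fdw. The following is only a rough sketch under assumed names (the server name shard_5433, the user mapping, and the local table names are illustrative, not the post's exact commands):

=# CREATE EXTENSION postgres_fdw;
=# CREATE SERVER shard_5433 FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (host 'localhost', port '5433', dbname 'postgres');
=# CREATE USER MAPPING FOR CURRENT_USER SERVER shard_5433
       OPTIONS (user 'postgres');
=# CREATE TABLE log_entries (log_time timestamp, entry text);
=# -- Foreign child table; the CHECK mirrors the remote constraint so
=# -- the planner can skip shards that cannot match a query.
=# CREATE FOREIGN TABLE log_entry_y2014 (
       CHECK (date(log_time) >= '2014-01-01' AND date(log_time) < '2015-01-01')
   ) INHERITS (log_entries)
   SERVER shard_5433 OPTIONS (table_name 'log_entry_y2014');

Repeating the same pattern for the server on 5434 yields a parent table whose rows live entirely on the two shards.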
On March 10th, we had our third ever pgDay for SFPUG, which was a runaway success. pgDaySF 2015 was held together with FOSS4G-NA and EclipseCon; we were especially keen to join FOSS4G because of the large number of PostGIS users attending the event. In all, around 130 DBAs, developers and geo geeks joined us for pgDay SF ... so many that the conference had to reconfigure the room to add more seating!
The day started out with Daniel Caldwell showing how to use PostGIS for offline mobile data, including a phone demo.
Ozgun Erdogan presented pg_shard with a short demo.
Gianni Ciolli flew all the way from London to talk about using Postgres' new Logical Decoding feature for database auditing.
Peak excitement of the day was Paul Ramsey's "PostGIS Feature Frenzy" presentation.
We also had presentations by Mark Wong and Bruce Momjian, and lightning talks by several presenters. Slides for some sessions are available on the FOSS4G web site. According to FOSS4G, videos will be available sometime soon.
Of course, we couldn't have done it without our sponsors: Google, EnterpriseDB, 2ndQuadrant, CitusDB and pgExperts. So a big thank you to our sponsors, our speakers, and the staff of FOSS4G-NA for creating a great day.
Before PostgreSQL got streaming replication, back in version 9.0, people kept asking when we were going to get replication. That was a common conversation-starter when standing at a conference booth. I don’t hear that anymore, but this dialogue still happens every now and then:
- I have streaming replication set up, with a master and standby. How do I perform failover?
- That’s easy, just kill the old master node, and run “pg_ctl promote” on the standby.
- Cool. And how do I fail back to the old master?
- Umm, well, you have to take a new base backup from the new master, and re-build the node from scratch...
- Huh, what?!?
pg_rewind is a better answer to that. One way to think of it is that it’s like rsync on steroids. Like rsync, it copies files that differ between the source and target. The trick is in how it determines which files have changed. Rsync compares timestamps, file sizes and checksums, but pg_rewind understands the PostgreSQL file formats, and reads the WAL to get that information instead.
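In practice, a failback with pg_rewind looks roughly like this (a sketch only; the data directory path and connection string are made-up examples):

$ # stop the old master once the standby has been promoted
$ pg_ctl -D /data/old_master stop -m fast
$ # rewind the old master's data directory to match the new master
$ pg_rewind --target-pgdata=/data/old_master \
      --source-server='host=newmaster port=5432 dbname=postgres'
$ # then point recovery.conf at the new master and start the old
$ # master as a standby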
I started hacking on pg_rewind about a year ago, while working for VMware. I got it working, but it was a bit of a pain to maintain. Michael Paquier helped to keep it up-to-date whenever upstream changes in PostgreSQL broke it. A big pain was that it has to scan the WAL and understand all the different WAL record types – miss even one and you might end up with a corrupt database. I made big changes to the way WAL-logging works in 9.5 to make that easier: all WAL record types now contain enough information, in a common format, to know which blocks they apply to. That slashed the amount of code required in pg_rewind, and made it a lot easier to maintain.
I have just committed pg_rewind into the PostgreSQL git repository, and it will be included in the upcoming 9.5 version. I always intended pg_rewind to be included in PostgreSQL itself; I started it as a standalone project to be able to develop it faster, outside the PostgreSQL release cycle, so I’m glad it finally made it into the main distribution. Please give it a look!
Somehow I've gotten through 10 years of SQL without ever learning this construction, which I found while proof-reading a colleague's blog post, and which looked so unlikely that I had to test it before I believed it actually worked. Just goes to show, there's always something new to learn.
Suppose you have a GPS location table:
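For illustration, a minimal such table might look like this (the table name gps_points and the SRID are assumptions; the columns are the ones the query below relies on):

CREATE TABLE gps_points (
    gps_track_id integer,                 -- which track a point belongs to
    gps_time     timestamptz,             -- when the point was recorded
    geom         geometry(Point, 4326)    -- the point itself
);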
You can get a correct set of lines from this collection of points with just this SQL:

SELECT gps_track_id,
       ST_MakeLine(geom ORDER BY gps_time ASC) AS geom
  FROM gps_points
 GROUP BY gps_track_id;
Those of you who already knew about placing ORDER BY within an aggregate function are going "duh", and the rest of you are, like me, going "whaaaaaa?"
Prior to this, I would solve this problem by ordering all the groups in a CTE or sub-query first, and only then pass them to the aggregate make-line function. This, is, so, much, nicer.
PostgreSQL has provided table partitions for a long time. In fact, one might say it has always had partitioning. The functionality and performance of table inheritance have increased over the years, and there are innumerable arguments for using it, especially for larger tables consisting of hundreds of millions of rows. So I want to discuss a quirk that often catches developers off guard. In fact, it can render partitioning almost useless or counter-productive.
PostgreSQL has a very good overview in its partitioning documentation. And the pg_partman extension at PGXN follows the standard partitioning model to automate many of the pesky tasks for maintaining several aspects of partitioning. With modules like this, there’s no need to manually manage new partitions, constraint maintenance, or even some aspects of data movement and archival.
However, existing partition sets are already out there, and not everyone knows about extensions like this; many shops have developed in-house systems instead. Here’s something I encountered recently:
CREATE TABLE sys_order (
    order_id    SERIAL PRIMARY KEY,
    product_id  INT NOT NULL,
    item_count  INT NOT NULL,
    order_dt    TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE sys_order_part_201502 () INHERITS (sys_order);

ALTER TABLE sys_order_part_201502
  ADD CONSTRAINT chk_order_part_201502
      CHECK (order_dt >= '2015-02-01'::DATE AND
             order_dt < '2015-02-01'::DATE + INTERVAL '1 mon');
This looks innocuous enough, but PostgreSQL veterans are already shaking their heads. The documentation alludes to how this could be a problem:
Keep the partitioning constraints simple, else the planner may not be able to prove that partitions don’t need to be visited.
The issue in this case is that adding the interval of a month turns the right boundary of this range constraint into a dynamic value, and PostgreSQL will not use dynamic values in evaluating check constraints for partition exclusion. Here’s a query plan from PostgreSQL 9.4.1, which is the most recent release as of this writing:
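The fix, presumably, is to keep both boundaries static so the planner can prove a partition irrelevant. A hedged sketch of the corrected constraint:

ALTER TABLE sys_order_part_201502
 DROP CONSTRAINT chk_order_part_201502;

ALTER TABLE sys_order_part_201502
  ADD CONSTRAINT chk_order_part_201502
      CHECK (order_dt >= '2015-02-01'::DATE AND
             order_dt < '2015-03-01'::DATE);

-- With both bounds static, queries filtered on other months can
-- skip this partition entirely.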
The BDR team has recently introduced support for dynamically adding new nodes to a BDR group from SQL into the current development builds. Now no configuration file changes are required to add nodes and there’s no need to restart the existing or newly joining nodes.
This change does not appear in the current 0.8.0 stable release; it’ll land in 0.9.0 when that’s released, and can be found in the bdr-plugin/next branch in the mean time.
New nodes negotiate with the existing nodes for permission to join. Soon they’ll be able to join the group without disrupting any DDL locking, global sequence voting, etc.
There’s also an easy node removal process so you don’t need to modify internal catalog tables and manually remove slots to drop a node anymore.
With this change, the long-standing GUC-based configuration for BDR has been removed. bdr.connections no longer exists and you no longer configure connections with bdr.[conname]_dsn etc.
Instead, node addition is accomplished with the bdr.bdr_group_join(...) function. Because this is a function in the bdr extension, you must first CREATE EXTENSION bdr;. PostgreSQL doesn’t have extension dependencies, and the bdr extension requires the btree_gist extension, so you’ll have to CREATE EXTENSION btree_gist first.
Creation of the first node must now be done explicitly using bdr.bdr_group_create. This promotes a standalone PostgreSQL database to a single-node BDR group, allowing other nodes to then be joined to it.
You must pass a node name and a valid externally-reachable connection string for the dsn parameter, e.g.:
CREATE EXTENSION btree_gist;
CREATE EXTENSION bdr;

SELECT bdr.bdr_group_create(
    local_node_name := 'node1',
    node_external_dsn := 'host=node1 dbname=mydb'
);
Note that the dsn is not used by the root node itself. It’s used by other nodes to connect to the root node, so you can’t use a dsn like host=localhost dbname=mydb if you intend to have nodes on multiple machines.
You can now join other nodes to form a multi-node group using bdr.bdr_group_join, as sketched below.
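A rough sketch of joining a second node (the parameter names mirror the call above; join_using_dsn tells the new node which existing node to negotiate the join through, and all of the values are examples rather than the post's own):

CREATE EXTENSION btree_gist;
CREATE EXTENSION bdr;

SELECT bdr.bdr_group_join(
    local_node_name := 'node2',
    node_external_dsn := 'host=node2 dbname=mydb',
    join_using_dsn := 'host=node1 dbname=mydb'
);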
The 2.1.6 release of PostGIS is now available.
The PostGIS development team is happy to release a patch for PostGIS 2.1: the 2.1.6 release. As befits a patch release, the focus is on bugs, breakages, and performance issues. Users with large tables of points will want to prioritize this patch, for substantial (~50%) disk space savings.
Last winter, we open-sourced pg_shard, a transparent sharding extension for PostgreSQL. It brought straightforward sharding capabilities to PostgreSQL, allowing tables and queries to be distributed across any number of servers.
Today we’re excited to announce the next release of pg_shard. The changes in this release include:
For more information about recent changes, you can view all the issues closed during this release cycle on GitHub.
Upgrading or installing is a breeze: see pg_shard’s GitHub page for detailed instructions.
Whether you want a distributed document store alongside your normal PostgreSQL tables or need the extra computational power afforded by a sharded cluster, pg_shard can help. We continue to grow pg_shard’s capabilities and are open to feature requests.
If you have any questions about pg_shard, please contact us using the pg_shard-users mailing list.
If you discover an issue when using pg_shard, please submit it to our issue tracker on GitHub.
A nice feature extending the usage of pgbench, the in-core Postgres tool aimed at benchmarking, has landed in 9.5 with this commit:
commit: 878fdcb843e087cc1cdeadc987d6ef55202ddd04
author: Robert Haas <email@example.com>
date: Mon, 2 Mar 2015 14:21:41 -0500
pgbench: Add a real expression syntax to \set

Previously, you could do \set variable operand1 operator operand2, but
nothing more complicated.  Now, you can \set variable expression, which
makes it much simpler to do multi-step calculations here.  This also
adds support for the modulo operator (%), with the same semantics as in C.

Robert Haas and Fabien Coelho, reviewed by Álvaro Herrera and Stephen Frost
pgbench has supported custom input files for ages, via -f, with custom variables that can be set with for example \set or \setrandom and then used in a custom set of SQL queries:
\set id 10 * :scale
\setrandom id2 1 :id
SELECT name, email FROM users WHERE id = :id;
SELECT capital, country FROM world_cities WHERE id = :id2;
Up to 9.4, those custom variables can only be calculated with simple rules of the type "var operator var2" (the commit message above is explicit enough), resulting in many intermediate steps and variables when doing more complicated calculations (note as well that any additional operands and variables after the first three are simply ignored):
\setrandom ramp 1 200
\set scale_big :scale * 10
\set min_big_scale :scale_big + :ramp
SELECT :min_big_scale;
In 9.5, such cases become much easier because pgbench now embeds a parser for complicated expressions. The calculation above can be written more simply, and far fancier things can be done:
\setrandom ramp 1 200
\set min_big_scale :scale * 10 + :ramp
SELECT :min_big_scale;
With pgbench run for a couple of transactions, here is what you could get:
$ pgbench -f test.sql -t 5
[...]
$ tail -n5 $PGDATA/pg_log/postgresql.log
LOG:  statement: SELECT 157;
LOG:  statement: SELECT 53;
LOG
I also met a couple of people from CoinBeyond. They are a point-of-sale software vendor that specializes in letting "regular" people (read: not I or likely the people reading this blog) use Bitcoin!
That's right folks, the hottest young currency in the market today is using the hottest middle-aged technology for their database, PostgreSQL. It was great to see that they are also located in Whatcom County. The longer I am here, the more I am convinced that Whatcom County (and especially Bellingham) is a quiet tech center working on profitable ventures without the noise of places like Silicon Valley. I just keep running into people doing interesting things with technology.
Oh, for reference:
Last month, I had the chance to talk about PostgreSQL monitoring, and present some of the tools I’m working on at pgconf.ru.
This talk was a good opportunity to work on an overview of existing projects dealing with monitoring or performance, see what may be lacking and what can be done to change this situation.
Here are my slides:
If you’re interested in this topic, or if you developed a tool I missed while writing these slides (my apologies if that’s the case), the official wiki page is the place you should go first.
I’d also like to thank all the pgconf.ru staff for their work; this conference was a big success, and the biggest PostgreSQL-centric event ever organized.
The one chart you should remember from this post is this one, GIN speedup between 9.3 and 9.4:
Interpreting this chart is a bit tricky - the x-axis tracks query duration on PostgreSQL 9.3 (log scale), while the y-axis (linear scale) tracks the duration ratio of 9.4 vs. 9.3, so 1.0 means 'equal performance' and 0.5 means that 9.4 is 2x faster than 9.3.
The chart pretty much shows the speedup growing with query duration for the vast majority of queries - the longer the duration on 9.3, the higher the speedup on 9.4. That's pretty awesome, IMNSHO. What exactly caused that will be discussed later (spoiler: it's thanks to GIN fastscan). Also notice that almost no queries are slower on 9.4, and the few that are slower are not significantly so.
While both pgbench and TPC-DS are well-established benchmarks, there's no such benchmark for testing fulltext performance (as far as I know). Luckily, I had played with the fulltext features a while ago, implementing archie - an in-database mailing list archive.
It's still quite experimental and I use it for testing GIN/GiST related patches, but it's suitable for this benchmark too.
So I've taken the current archives of PostgreSQL mailing lists, containing about 1 million messages, loaded them into the database and then executed 33k real-world queries collected from postgresql.org. I can't publish those queries because of privacy concerns (there's no info on users, but still ...), but the queries look like this:
SELECT id FROM messages
 WHERE body_tsvector @@ ('optimizing & bulk & update')::tsquery
 ORDER BY ts_rank(body_tsvector, ('optimizing & bulk & update')::tsquery) DESC
 LIMIT 100;
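Queries like this rely on a GIN index on the tsvector column, something along these lines (the index name is illustrative; the table and column come from the query above):

CREATE INDEX messages_body_gin ON messages USING gin (body_tsvector);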
The number of search terms varies quite a bit - the simplest queries have a single letter, while the most complex ones often have tens of words.
The PostgreSQL configuration was mostly default, with only minor changes:
I also submitted a talk entitled "Suck it! Webscale is Dead; PostgreSQL is King!". This talk was submitted as a joke. I never expected it to be accepted; it hadn't been written, and the abstract was submitted on the fly, improvised and in one take. Guess which talk was accepted? "Webscale is Dead; PostgreSQL is King!". They dropped the first sentence of the title, which is absolutely acceptable. The conference organizers know their audience best and what should be presented.
What I have since learned is that the talk submission committee was looking for dynamic talks, dynamic content, and new, inspired ideas. A lot of talks that would have been accepted in years past weren't, and my attempt at humor fit the desired outcome. At first I thought they were nuts, but then I primed the talk at the SDPUG/PgUS PgDay @ Southern California Linux Expo.
I was the second-to-last presenter on Thursday. I was one hour off the plane. I was only staying the night and flying home early the next morning. The talk was easily the best-received talk I have given. The talk went long, the audience was engaged; laughter, knowledge and opinions abounded. When the talk was over, it was met with enthusiastic applause and, with a definite need for water, I left the room.
I was followed out by at least 20 people. I don't know exactly how many there were, but it was more than have ever followed me after a talk before. I was deeply honored by the reception. One set of guys that approached me said something to the effect of: "You seem like you don't mind expressing your opinions".
This is not the first time that 2ndQuadrant has looked at Puppet. Gabriele Bartolini has already written an article in two parts on how to rapidly configure a PostgreSQL server through Puppet and Vagrant, accompanied by the release of the code used in the example on GitHub (https://github.com/2ndquadrant-it/vagrant-puppet-postgresql).
This article, split into three parts, aims to demonstrate the automation of the setup and configuration of Barman for backing up a PostgreSQL test server.
This article is an update of what was written by Gabriele with the idea of creating two virtual machines instead of one, a PostgreSQL server and a Barman server.
it2ndq/barman is the module released by 2ndQuadrant Italy to manage the installation of Barman through Puppet. The module has a GPLv3 licence and is available on GitHub. The following procedure was written for an Ubuntu 14.04 Trusty Tahr system, but can be performed in a similar manner on other distributions.
To start the module for Barman on a virtual machine, we need the following software:
Vagrant is a virtual machine manager, capable of supporting many virtualisation platforms, with VirtualBox as its default.
We install VirtualBox this way:
$ sudo apt-get install virtualbox virtualbox-dkms
The latest version of Vagrant can be downloaded from the site and installed with the command:
$ sudo dpkg -i /path/to/vagrant_1.7.2_x86_64.deb
Regarding Ruby, our advice is to use rbenv, which creates a Ruby development environment in which to specify the version for the current user, thereby avoiding contamination of the system environment. To install rbenv we suggest using rbenv-installer.
Let’s download and execute the script:
$ curl https://raw.githubusercontent.com/fesplugas/rbenv-installer/master/bin/rbenv-installer | bash
At the end, the script will prompt you to append a few lines to your shell profile, typically the ones shown below.
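A sketch of those lines, assuming a standard rbenv setup (the exact output of rbenv-installer may differ):

# make the rbenv command available and enable its shims
export PATH="$HOME/.rbenv/bin:$PATH"
eval "$(rbenv init -)"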
I was lucky to participate as a speaker at Nordic PostgreSQL Day 2015, and it was another awesome edition of the conference. Really smooth; everything ran as it should, with about one hundred people at the conference.
The Nordic pgDay is such a successful conference that I had long wanted to have just the same in my area, and so we made pgDay Paris and modeled it on Nordic. It's planned to be all the same, just with a different audience given the location.
pgDay Paris welcomes English-speaking speakers, and the Call for Papers is now open, so please consider submitting your talk proposal for pgDay Paris by clicking on the link and filling in the form. Not too hard a requirement for being allowed to visit such a nice city as Paris, really!
Though it is a rare occurrence, we have had occasions where we need to purge ALL data from a table. Our preferred approach is TRUNCATE TABLE, because it's orders of magnitude faster than the DELETE FROM construct. You can't, however, use TRUNCATE TABLE unqualified if the table you are truncating has foreign key references from other tables. In comes its extended form, the TRUNCATE TABLE .. CASCADE construct, introduced in PostgreSQL 8.2, which will not only delete all data from the main table, but will CASCADE to all the referenced tables, as the sketch below shows.
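Here is a hedged sketch of the difference (the table names orders and order_items are hypothetical, not from the original article):

-- Fails if order_items has a foreign key referencing orders:
TRUNCATE TABLE orders;

-- Succeeds, but also empties order_items and every other table
-- whose foreign keys reference orders:
TRUNCATE TABLE orders CASCADE;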