
The reason why I do benchmarks is that, in my experience, many recommended settings are effectively myths. Many are perfectly fine, but a surprisingly large part either was never quite true or is based on data that has become obsolete for various reasons (e.g. changes in hardware or software). So verifying the facts is rather important for making good decisions, which is why I've recently been benchmarking the file systems available on Linux, briefly discussed in the posts on EXT4 vs. XFS, BTRFS and ReiserFS.

In this post, I'd like to briefly look at a potential myth when it comes to databases and I/O schedulers. The purpose of the scheduler is to decide in what order to submit I/O requests (received from the user) to the hardware - it may decide to reorder them, coalesce requests into larger continuous chunks, prioritize requests from some processes etc.

Linux currently (kernel 4.0.4) offers three schedulers (there used to be an anticipatory scheduler, but it was removed in kernel 2.6.33):

  • cfq (default) - attempts to do "completely fair queueing" for various processes, with the possibility to lower/increase priority of some processes using ionice, and also applies a fair amount of magic (reordering, coalescing etc.)
  • noop - probably the simplest scheduler, pretty much just a FIFO with trivial merging of requests
  • deadline - attempts to guarantee start time of serving the request, prioritises reads over writes and performs various other things

I'm not going to discuss cfq here - it's a fairly complex scheduler that attempts to reorder the requests in various ways, support priorities, handle synchronous and asynchronous requests differently and such. Instead, let's look at the simpler schedulers and why they might work better in some cases.


The usual wisdom used to be that the deadline scheduler works best for databases. The reasoning behind this recommendation is quite nicely explained, for example, here, but let me quote the main part:

This tends to reduce the merging opportunity for reads as they are dispatched q

[continue reading]

One of the upcoming features in PostgreSQL 9.5 is foreign table inheritance, which offers a suite of new possibilities for users of foreign data wrappers. An exciting possibility is that you will be able to create a single table in which recent data is stored in a row-based format for fast appending and look-ups, while older data is stored in a columnar format for compression and fast analytics using cstore_fdw.

An example use-case might be a news organization which has frequent look-ups of recent data (e.g. to generate news feeds), while older data is used for analytics (e.g., find correlations in events). We use the gdelt dataset containing hundreds of millions of news events as a reference. This use-case will benefit greatly from using row-based storage for recent data, and columnar storage for older data, so let's see how we will be able to do this in PostgreSQL 9.5. The current release of cstore_fdw is not yet 9.5 compatible, but a preview is available on the 9.5-compatible branch.

To get started, we can create the events table and the row-based events_current table that inherits from it (see full source). When querying the events table, postgres includes inherited tables in the results.

CREATE TABLE events (
    event_date int  -- remaining gdelt columns omitted; see the full source linked above
);
CREATE TABLE events_current () INHERITS(events);

We also set up a trigger that routes INSERTs on the events table to the events_current table.

CREATE FUNCTION events_insert_trigger() RETURNS trigger AS $$
BEGIN
    INSERT INTO events_current VALUES (NEW.*);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_insert_trigger BEFORE INSERT ON events
FOR EACH ROW EXECUTE PROCEDURE events_insert_trigger();

We can now start adding new data to our event table, which will use row-based storage. The gdelt dataset comprises over 100GB of news events in easy-to-digest CSV format. It can be copied into the events table using:

\COPY events FROM '20150101.export.CSV' WITH (NULL '')
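
For context, here is a hedged sketch of my own (not from the post) of what a cstore_fdw child table might look like, using the server and compression options documented by cstore_fdw; the table name events_historical is hypothetical:

-- one-time setup (assumes the 9.5-compatible cstore_fdw branch is installed)
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- hypothetical columnar child table; it inherits all columns from events,
-- so queries against events transparently include its rows
CREATE FOREIGN TABLE events_historical ()
  INHERITS (events)
  SERVER cstore_server
  OPTIONS (compression 'pglz');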

After a while we can migrate older data to a cstore_fdw partition. We first cr

[continue reading]

Posted by Shaun M. Thomas on 2015-10-02 at 21:33:10

I’ve been talking about partitions a lot recently, and I’ve painted them in a very positive light. Postgres partitions are a great way to distribute data along a logical grouping and work best when data is addressed in a fairly isolated manner. But what happens if we direct a basic query at a partitioned table in such a way that we ignore the allocation scheme? Well, what happens isn’t pretty. Let’s explore in more detail.

Let’s use the fancy partitioning I introduced a while back. Using this structure, I ran a bunch of tests with a slight modification to my python script. I changed the reading_date granularity to daily, and had the script itself create and destroy the partitions. This allowed me to run several iterations and then meter the performance of queries on the resulting output.

The relevant python chunk looks like this:

end_stamp = stamp + timedelta(days = 1)
part = ''

if j > 0:
    # (the date arguments below are reconstructed from the surrounding text)
    part = '_part_%d%02d%02d' % (stamp.year, stamp.month, stamp.day)

    cur.execute("DROP TABLE IF EXISTS sensor_log%s" % part)
    cur.execute(
      "CREATE TABLE sensor_log%s (" % part +
      "  LIKE sensor_log INCLUDING ALL," +
      "  CHECK (reading_date >= '%s' AND" % stamp +
      "    reading_date < '%s')" % end_stamp +
      ") INHERITS (sensor_log)"
    )

With that, the script will assume daily granularity and build the check constraints properly so constraint exclusion works as expected. To get an idea of how partitions scale with basic queries, I ran the script with 10, 20, 50, 100, 150, and 250 days. This gave a wide distribution of partition counts and the numbers I got made it fairly clear what kind of drawbacks exist.
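
To make the format strings above concrete, this is roughly the DDL the script generates for a single partition (my own expansion; the date is hypothetical):

CREATE TABLE sensor_log_part_20150101 (
  LIKE sensor_log INCLUDING ALL,
  CHECK (reading_date >= '2015-01-01' AND
         reading_date < '2015-01-02')
) INHERITS (sensor_log);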

All tests on the partition sets used these two queries:

-- Check 100 distributed values in the partitions.
  FROM sensor_log
 WHERE sensor_log_id IN (
         SELECT generate_series(1, 10000000, 100000)
-- Check 100 random values in the partitions.
  FROM sensor_log
 WHERE sensor_

[continue reading]

PostgreSQL 9.5 will have support for a feature that is popularly known as "UPSERT" - the ability to either insert or update a row according to whether an existing row with the same key exists. If such a row already exists, the implementation should update it. If not, a new row should be inserted. This is supported by way of a new high level syntax (a clause that extends the INSERT statement) that more or less relieves the application developer from having to give any thought to race conditions. This common operation for client applications is set to become far simpler and far less error-prone than legacy ad-hoc approaches to UPSERT involving subtransactions.

When we worked on UPSERT, many edge-cases were carefully considered. A technique called "unique index inference" allows DML statement authors to be very explicit about what condition they want to take the alternative (UPDATE or NOTHING) path on. That alternative path can only be taken in the event of a would-be duplicate violation (for the DO NOTHING variant, a would-be exclusion violation is also a possible reason to take the alternative NOTHING path). The ability to write UPSERT statements explicitly and safely while also having lots of flexibility is an important differentiator for PostgreSQL's UPSERT in my view.

As the 9.5 INSERT documentation explains, the inference syntax contains one or more column_name_index (columns) and/or expression_index expressions (expressions), and perhaps an optional index_predicate (for partial unique indexes, which are technically not constraints at all). This is internally used to figure out which of any available unique indexes ought to be considered as an arbiter of taking the alternative path. If none can be found, the optimizer raises an error.
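
To make the syntax concrete, here is a minimal sketch of my own (the users table and its unique id column are assumptions, not an example from the post):

-- take the UPDATE path when a row with the same id already exists;
-- (id) is the inference clause that selects the arbiter unique index
INSERT INTO users (id, name)
VALUES (1, 'Alice')
ON CONFLICT (id)
DO UPDATE SET name = EXCLUDED.name;

-- or silently skip would-be duplicates instead
INSERT INTO users (id, name)
VALUES (1, 'Alice')
ON CONFLICT DO NOTHING;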

The inference syntax is very flexible, and very tolerant of variations in column ordering, whether or not a partial unique index predicate is satisfied, and several other things. It can infer multiple unique indexes at a time, which is usually not necessary, but can be in th

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2015-10-02 at 15:03:00
A conversation yesterday with a client: "What changes were made in the PostgreSQL configs?" "I don't recall exactly." This is why you should keep all the configs under version control, and have a regular monitoring check that the controlled version is the same as the operating version, so that if someone makes a manual change without checking it in, the monitoring alarm will go off.

If you keep your config in the data directory, in the default PostgreSQL style, probably the simplest way to do this is to create a git repository with a symbolic link to the data directory and a .gitignore that ignores everything in the data directory but the config files. Then add the config files and you're done. If you keep the config files separately, as Debian does, then you can probably just create a git repository right in the config directory if you don't already have one at a higher level.

I have been guilty of not doing this in the past, but I have come to the strong conclusion that this should be a part of standard best practice.
Posted by gabrielle roth on 2015-10-02 at 00:54:00

When: 6-8pm Thursday October 15, 2015
Where: Iovation
Who: Gavin McQuillan
What: Sharded Postgres with PLProxy

Data of Future Past: Postgres as Distributed Online Processing Analytics Engine

Postgres has long been a great analysis tool, but as organizations seek to decentralize their data, it’s overlooked for other solutions. Maybe this doesn’t need to be the case. Using a novel combination of tools, old and new, Postgres can also decentralize its data. We can maintain the immensely powerful interface that we love, while meeting (or exceeding) throughput decentralized datastores provide. In this talk we see how this is possible using Foreign Data Wrappers (PLProxy) and HyperLogLog (Postgres-HLL) extensions, detailing benefits and constraints to this method. We explore a reimplementation of a complex Online Processing Analytics hypercube in Hbase with Postgres and measure its effectiveness, throughput, latency, and cost savings.

Gavin is a longtime PostgreSQL enthusiast, specializing in building automation and analytics solutions, distributed systems, scaling and securing web applications. His experience includes Google SRE, software development and leading teams in young technology startups. He lives in Portland with his family, bicycles, and hand planes.

If you have a job posting or event you would like me to announce at the meeting, please send it along. The deadline for inclusion is 5pm the day before the meeting.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

Elevators open at 5:45 and building security closes access to the floor at 6:30.

See you there!

Andreas 'ads' Scherbaum
Pivotal hosted the September PostgreSQL Conference in Beijing. I posted about this event in the Pivotal blog.
When I first started working with docker last year, there was already a clear pattern out there: the docker image itself consists only of the application binary (and, depending on the philosophy, the entire set of OS libraries it requires), while all application data goes in a volume.

The concept called "Data Container" also seemed to be somewhat popular at that time. Not everyone bought into that philosophy, though, and various other patterns were emerging then for how people used volumes with their docker containers.

One of the emerging patterns was (and still is) "data initialization if it does not exist" during container startup.
Let's face it: when we first start a docker container holding, say, a PostgreSQL 9.4 database, the volume is an empty file system. We then run initdb and set up the database so that it is ready to serve.
The simplest way is to check whether the data directory has data in it and, if it does not, run initdb, apply the most common best practices for the database, and serve it up.

Where's the simplest place to do this? In the entrypoint script of the docker container, of course.

I made the same mistake in my jkshah/postgres:9.4 image too. In fact, I still see that same pattern in the official postgres docker image, where it looks for PG_VERSION and, if it does not exist, runs initdb.

if [ ! -s "$PGDATA/PG_VERSION" ]; then
    gosu postgres initdb
fi


This certainly has advantages:
1. Very simple to code the script.
2. Great out-of-the-box experience - you start the container up, it sets itself up, and it is ready to use.

Let's look at what happens next in real-life enterprise usage.

We ran into scenarios where the applications using such databases were running, but had lost all the data in them. Hmm, what's going wrong here? The application is working fine, the database is working fine, but all the data looks freshly deployed, not like something that had been running well for 3-5 months.

Let's look at various activities that an enterprise will typically do with such

[continue reading]

Matthew Kelly, TripAdvisor's in-house PostgreSQL guru, spoke to us recently about PGConf Silicon Valley which is November 17-18 at the South San Francisco Conference Center. Early Bird pricing ends October 4th.

Matthew's talk at PGConf Silicon Valley 2015 is At the Heart of a Giant: Postgres at TripAdvisor. Matthew said the PostgreSQL database server has been at the core of the company's infrastructure since the "early days as a scrappy start up" 15 years ago.

PostgreSQL is the database powering TripAdvisor’s 250 million reviews from inquisitive travellers around the world. With more than 375 million unique monthly visitors, TripAdvisor is the world's largest travel site and has been helping travelers plan and book the perfect trip since 2000.

Matthew joined TripAdvisor first as an intern during his computer science studies at the University of Massachusetts Amherst and was hired full-time following his graduation in 2012. I sat down with him virtually recently and here's what he had to say…

Terry: How is Postgres used at TripAdvisor today?

Matthew: Postgres is the backbone of our entire core production infrastructure here at TripAdvisor. On any given day, our Postgres infrastructure throughout the company might have serviced more than 1 billion queries. These will range from the OLTP sort of operations you would expect from a high-end website, to analytics, to real-time anomaly processing, to fraud detection. We even use it to hold the photos that are in our upload queue. We run numerous different hardware configurations and workloads. Sure, we have Hadoop for some back-office processing, but Postgres really is the primary workhorse of our data infrastructure.

Terry: TripAdvisor gets a massive amount of traffic. Similar companies operating at that scale are using or considering WebscaleSQL – which as you know is a collaboration among engineers from several companies facing similar challenges in running MySQL at scale. How does PostgreSQL stack up against it in terms of forming the backbone of a popular site like you

[continue reading]

Synchronous replication, introduced in 9.1, is the ability to wait for a WAL flush confirmation from a standby before committing a transaction on the master node (depending on synchronous_commit, whose default value is on), ensuring that a committed transaction has not been lost on the standby should a failover be needed. It has seen wide adoption in many production environments for applications that need a no-data-loss guarantee. Still, such applications or clients may have seen something like this:

Cancel request sent
WARNING:  01000: canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been replicated to the standby.
LOCATION:  SyncRepWaitForLSN, syncrep.c:217

What happens here is that the commit query remained stuck for a couple of seconds because it kept waiting for a flush confirmation that was not coming. Depending on the outage of the standby this could take a while, so what happened here is that the query was manually cancelled, and the transaction was then committed on the master without the confirmation from the standby. One way to check whether a backend is stuck like this is to look at the output of ps for something like the following:

$ ps x | grep postgres | grep waiting
15323   ??  Ss     0:00.06 postgres: easteregg easteregg [local] COMMIT waiting for 0/1797488

But that's not particularly useful when the user is not able to connect directly to the node involved (usually only a superuser of this database instance can do that), and pg_stat_activity does not offer a way to know whether the backend is really stuck at commit because of synchronous replication or because of another lock.

=# SELECT pid, state, waiting FROM pg_stat_activity
   WHERE lower(query) = 'commit;';
  pid  | state  | waiting
-------+--------+---------
 15323 | active | f
(1 row)
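
As an aside of my own (not from the post): a session that prefers not to block on an unavailable standby can fall back to waiting only for the local WAL flush, using the standard synchronous_commit setting.

-- commit waits for the local WAL flush only, not for the standby's confirmation
SET synchronous_commit TO local;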

Note that there is a patch submitted for integration t

[continue reading]

On October 8th at 17:30 the second Dutch PostgreSQL meetup will be held in Amsterdam. Please join us for some very interesting talks.

Andres Freund, a PostgreSQL committer and developer working at Citus Data, will speak about major scalability improvements in recent PostgreSQL versions and what the biggest remaining scalability concerns are. Some of the remaining issues can mainly be addressed by changing Postgres, others can be worked around on the application side.

Marco Slot from Citus Data will give a talk about the open source pg_shard extension, which lets you seamlessly shard and scale out PostgreSQL tables across many machines. We will look at how to use it for building a scalable document store and demonstrate some upcoming features.

Finally, Reiner Peterke from Splendid Data will show pg_insight, a flexible tool for collecting historical statistics information using the Postgres statistics infrastructure. It integrates with Elasticsearch for graphical representation of the data, for performance monitoring and trending.

You can RSVP on the PostgreSQL Meetup. Hope to see you there!

Come to the Dutch PostgreSQL Meetup in Amsterdam on October 8th

Talks by Andres Freund, Marco Slot, and Reiner Peterke

Modern servers provide more and more CPU cores; 16 cores or more are not uncommon these days. Unfortunately, PostgreSQL cannot scale a single query to more than one core. A lot of effort is already going into this issue. Cybertec experiments: to address the issue, we have done some experiments to see what […]
Posted by gabrielle roth on 2015-09-29 at 01:37:00
We’ve wrapped up another PostgresOpen! The official numbers aren’t in yet, but we were up a significant percentage from last year. We had so many last-minute registrants and walk-ups that we ran out of extra badges. Even better, I got to attend most of the talks on my list. I can’t pick a favorite, but […]
Posted by Tomas Vondra in 2ndQuadrant on 2015-09-28 at 17:40:00

In the last two posts, I've briefly discussed the results of a benchmark measuring PostgreSQL performance on different filesystems. The first post compared EXT4 and XFS, and the second one mostly dealt with the problems I ran into when benchmarking BTRFS.

In the discussion, I was asked to benchmark ReiserFS. I hadn't really planned to benchmark this file system because ... let's say ReiserFS does not have the best reputation when it comes to reliability, for historical reasons. According to some sources, the motto of the development was essentially "performance over correctness", which is an approach database folks don't really like. I don't know how much of that is true for the current state of ReiserFS, or how it applies to the new Reiser4 (which is not merged, and I don't think that will ever happen).

But database people are rather conservative when it comes to storage reliability, so even if all the issues were fixed and the current version is reliable, it's unlikely to become very popular in this group soon. Factor in that development of ReiserFS has pretty much stopped (in favor of Reiser4, which has not been merged for years), so the code is in maintenance mode only. That is not necessarily bad, but it also means none of the significant reworks needed for new features and performance improvements (which was the purpose of Reiser4).

So let's see if ReiserFS 3 actually performs so much better than the common file systems (EXT4 and XFS), as the users often believe ...

The short answer is "no". At least not on the workload measured by pgbench (a lot of random I/O) and on this particular hardware (good SSD), ReiserFS 3 actually performs worse than both XFS and EXT4.


On read-only benchmarks, all the filesystems perform pretty much exactly the same, irrespective of the amount of data. For example on the large data set, you'll get something like this:


The chart shows throughput (transactions per second) with different number of clients, for those three filesystems. And there's pret

[continue reading]

Posted by Pavel Stehule on 2015-09-28 at 04:32:00
I wrote a simple extension, session_exec. This extension enforces running a specified function before a user is connected to PostgreSQL. This function can be used for some complex user management.


To enable it, set session_preload_libraries to session_exec and session_exec.login_name to login.


# first login
[pavel@dhcppc4 plpgsql_check]$ psql
WARNING: function "login()" does not exist

postgres=# CREATE OR REPLACE FUNCTION login()
postgres-# RETURNS void AS $$
postgres$# BEGIN
postgres$# IF current_database() <> 'admindb' AND session_user = 'postgres' THEN
postgres$# RAISE EXCEPTION 'user postgres is blocked';
postgres$# ELSE
postgres$# RAISE NOTICE 'Hello, %', session_user;
postgres$# END IF;
postgres$# RETURN;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;

# new login
[pavel@dhcppc4 plpgsql_check]$ psql
NOTICE: Hello, pavel

-- try to login as Postgres
postgres=# \c postgres postgres
FATAL: unhandled exception in login function "login"
DETAIL: user postgres is blocked
CONTEXT: PL/pgSQL function login() line 4 at RAISE
session_exec: perform login function "login"
Previous connection kept

This is only an example. For this specific use case, modifying pg_hba.conf is better, but you can implement more complex logic in plpgsql. The login function can be specific to a database or to a user via ALTER DATABASE xx SET or ALTER ROLE xx SET.


When you use a login function, connecting to the database will be a little bit slower. Use it only when it is necessary.

There are two PostgreSQL FDWs (currently maintained) that I know of for connecting to SQL Server from a Linux/Unix PostgreSQL box. There is the TDS foreign data wrapper (tds_fdw), which relies on the FreeTDS driver. This is a fairly light-weight FDW since it just relies on TDS, which is commonly already available on Linux installs or an easy install away. Unfortunately, when I tried to use it on windows (compiling my usual mingw64 way), while it compiled and installed, it crashed when I attempted to connect to a table on my SQL Server 2008 R2 box, so I gave up on it for the time being as a cross-platform solution. One thing I will say about it is that, from what I can see, it accepts ad-hoc queries as a data source, which is pretty nice. So we may revisit it in the future to see if we can get it to work on windows. I'm not sure if tds_fdw supports SQL Server spatial geometry columns, though it would be interesting to try.

The second option, which as you may have noticed we have spent much time talking about, is the ogr_fdw foreign data wrapper. ogr_fdw utilizes UnixODBC on Linux, iODBC on MacOSX, and Windows ODBC on windows for connecting to SQL Server. ogr_fdw's big downside is that it has a dependency on GDAL, which is a hefty FOSS swiss-army-knife ETL tool that is a staple of all sorts of spatial folks doing both open source and proprietary development. The good thing about ogr_fdw is that, since it is a spatial driver, it knows how to translate SQL Server geometry to its equivalent PostGIS form, in addition to being able to handle most of the other not-so-spatial columns.
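
As a rough sketch of my own (not from the post; the DSN, table and column names are hypothetical, and the option names are the ones ogr_fdw documents), wiring it up against an ODBC data source looks roughly like this:

CREATE EXTENSION ogr_fdw;

-- hypothetical ODBC DSN pointing at the SQL Server database
CREATE SERVER mssql_svr
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource 'ODBC:myuser/mypass@MyDsn', format 'ODBC');

-- hypothetical layer exposed as a foreign table (geometry assumes PostGIS is installed)
CREATE FOREIGN TABLE mssql_places (
  id integer,
  name varchar,
  geom geometry
) SERVER mssql_svr
  OPTIONS (layer 'places');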

Continue reading "Connecting to SQL Server from Linux using FDWs"
Posted by Josh Berkus on 2015-09-26 at 03:49:03

On October 15th Josh Berkus will be reprising EXPLAIN Explained, his presentation from Postgres Open, at the Pandora offices in downtown Oakland. This talk will cover how to read EXPLAIN and EXPLAIN ANALYZE, and is aimed at developers and new DBAs. Pandora staff will also do a lightning talk.

On November 17th and 18th is pgConfSV. Registration is open and the full schedule is up. Be there!

On November 19th, the day after pgConfSV, Tableau is hosting an extended meetup which will cover using Tableau with PostgreSQL, Joe Conway on PL/R, and Álvaro Hernández Tortosa on Logical Decoding. This meetup will be in Palo Alto, and is expected to fill up very soon.

We are still looking to schedule a meetup in the city of San Francisco for late October or early November, but need a speaker for it. If you wanted to present, and didn't submit in time for pgConfSV, now's your chance. Contact the organizers through the meetup.

Posted by Shaun M. Thomas on 2015-09-25 at 17:07:41

This PG Phriday is going to be a bit different. During my trip to Postgres Open this year, I attended a talk I had originally written off as “some Red Hat stuff.” But I saw the word “containers” in the PostgreSQL in Containers at Scale talk and became intrigued. A few days later, I had something of an epiphany: I’ve been wrong about servers for years; we all have.

That’s a pretty bold claim, so it needs some background.

Fairly often in infrastructure and development, we see servers like this diagram:


But what’s the most important thing in the diagram? Can you even tell? Well, everything except the AppData component can be replaced. The primary goal of highly available design is building an architecture to ensure this is the case. I don’t care where the data lives, or what server is hosting it, so long as certain performance metrics are supplied by the underlying hardware. But from that diagram, the most important element would appear to be the servers themselves.

This kind of thinking is a trap, and one pretty much everyone I’ve met in the industry is guilty of falling into. In a two-server Postgres stack, where one server is the primary database and the other is a replica, we tend to gravitate toward preserving the original 1->2 orientation. This may be the case even if retaining this arrangement requires a second failover after an outage or some kind of maintenance. For some reason, the physical server gets the focus. How many times have you seen this? I can’t even count the number of times I’ve encountered servers named after their primary role as a database host, yet I have to physically log in to find out what the databases are. Sure, sometimes there’s a Wiki page of some kind outlining the data to host relationship, but is that really putting the data first?

That’s what I mean when I say we’ve all had it wrong. This obviously isn’t absolutely true, but the prevailing momentum still treats servers as important. They’re not. They’re a medium for whatever is running on them. Businesses with dozens or hundre

[continue reading]

Posted by Markus Winand on 2015-09-25 at 15:26:45

I've just launched my new website. It's very small right now, just six content pages, but it will grow over the next months and years and might eventually become a book, like my other site did. So I thought I'd better get you on board right now so you can grow as the site grows.

So what is modern SQL about? Yet another SQL reference? Absolutely not. There are definitely enough references out there. And they all suffer from the same issues:

  • They hardly cover more than entry-level SQL-92. Some tutorials don't even mention that they describe a proprietary dialect.

  • The examples just show the syntax but not how to solve real-world problems.

  • They don't document the availability of the features amongst different database products.

No question, the latter ones are a consequence of the first one. However, modern SQL is different:

  • modern SQL covers SQL:2011—the current release of the ISO SQL standard. But it acknowledges the fact that many databases are developed against older releases or only support a subset of the standard.

  • modern SQL shows how to use the most recent SQL features to solve real world problems. It also provides “conforming alternatives” as a way to solve the same problem using other (often older) standard methods. If necessary and possible it will also show “proprietary alternatives.”

    Note that there is a hierarchy: the recommended (most idiomatic) approach is shown first, standard conforming alternatives next and proprietary alternatives last but only if inevitable.

  • modern SQL documents the availability of the features amongst six SQL databases (the example is about values):

[availability chart for VALUES across the six databases]

Click on a feature (e.g., “valid where select is valid”) to see how the availability of the features has evolved.

There you can see that SQLite just recently started to accept values where select is valid. The timeline view also documents the last checked version for those databases that don't support a feature.
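
As a small illustration of my own of what "valid where select is valid" means:

-- a bare VALUES list used where a SELECT would normally appear
VALUES (1, 'one'),
       (2, 'two');

-- the same list used as a derived table
SELECT *
  FROM (VALUES (1, 'one'), (2, 'two')) AS t(id, name);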

Inevitably, modern SQL will become both: an homage

[continue reading]

The recent release of PgBouncer 1.6, a connection pooler for Postgres, brought a number of new features. The two I want to demonstrate today are the per-database and per-user pool_modes. To get this effect previously, one had to run separate instances of PgBouncer. As we shall see, a single instance can now run different pool_modes seamlessly.

There are three pool modes available in PgBouncer, representing how aggressive the pooling becomes: session mode, transaction mode, and statement mode.

Session pool mode is the default, and simply allows you to avoid the startup costs for new connections. PgBouncer connects to Postgres, keeps the connection open, and hands it off to clients connecting to PgBouncer. This handoff is faster than connecting to Postgres itself, as no new backends need to be spawned. However, it offers no other benefits, and many clients/applications already do their own connection pooling, making this the least useful pool mode.

Transaction pool mode is probably the most useful one. It works by keeping a client attached to the same Postgres backend for the duration of a transaction only. In this way, many clients can share the same backend connection, making it possible for Postgres to handle a large number of clients with a small max_connections setting (each of which consumes resources).

Statement pool mode is the most interesting one, as it makes no promises at all to the client about maintaining the same Postgres backend. In other words, every time a client runs a SQL statement, PgBouncer releases that connection back to the pool. This can make for some great performance gains, although it has drawbacks, the primary one being no multi-statement transactions.

To demonstrate the new pool_mode features, I decided to try out a new service mentioned by a coworker, called Scaleway. Like Amazon Web Services (AWS), it offers quick-to-create cloud servers, ideal for testing and demonstrating. The unique thing about Scaleway is that the servers are all ARM-based SSDs. Mini-

[continue reading]

Posted by Paul Ramsey on 2015-09-24 at 00:00:00

This is it, PostGIS 2.2 is almost out the door, so we’re looking for testing and feedback! Please give the release candidate a try and report back any issues you encounter.

View all closed tickets for 2.2.0.

Posted by Terry Erisman in CitusData on 2015-09-23 at 15:57:47

I am pleased to announce that the full schedule of speakers and events for the first annual PGConf Silicon Valley conference is now available on the new conference website. Breakout session tracks include DevOps, New Features, PostGIS, Tales from the Trenches, Hacking Postgres, and Data at Scale.

In addition, we welcome two major new organizations to the list of PGConf Silicon Valley sponsors, Pivotal at the Platinum level and Amazon Web Services at the Silver level. Organized in cooperation with the San Francisco PostgreSQL Users Group, PGConf Silicon Valley is a technical conference aimed at the Silicon Valley PostgreSQL community and beyond. 

The conference is November 17-18, 2015 at the South San Francisco Conference Center. Early Bird registration for PGConf Silicon Valley is available through Oct. 4, 2015. Register now to attend and receive an additional 20% off your registration fees when you use the discount code CitusData20 at checkout!

Breakout Sessions & Tutorials

The full breakout session schedule includes:

  • Matthew Kelly, In-House Postgres Expert at TripAdvisor, on “At the Heart of a Giant: Postgres at TripAdvisor”
  • Ryan Lowe, Production Engineer & John Cesario, Platform Engineer at Square, on “Postgres for MySQL DBAs”
  • Magnus Hagander, Database Architect, Systems Administrator & Developer at Redpill Linpro, on “What’s New in PostgreSQL 9.5”
  • Paul Ramsey, Solutions Engineer at Cartodb, on “This is PostGIS”
  • Peter Geoghegan, Database Engineer at Heroku, on “UPSERT use cases”
  • Grant McAlister, Senior Principal Engineer, on “Cloud Amazon RDS for PostgreSQL - What’s New and Lessons Learned”
  • Gavin McQuillan, Senior Data Engineer at Urban Airship, on “Data of Future Past: Postgres as Distributed Online Processing Analytics Engine”
  • Samantha Billington, Database Manager at Turnitin, on “...Lag - What's wrong with my slave?”
  • Andreas Scherbaum, Advisory Consultant at EMC Deutschland GmbH, on “Enrich your data with geocoordinates from OpenStreetMap or ArcGIS”
  • Dan Robinson, Lead Engineer at Heap, on “P

[continue reading]

On 8th of September, Alvaro Herrera committed patch: Allow per-tablespace effective_io_concurrency   Per discussion, nowadays it is possible to have tablespaces that have wildly different I/O characteristics from others. Setting different effective_io_concurrency parameters for those has been measured to improve performance.   Author: Julien Rouhaud Reviewed by: Andres Freund There is this, not very well […]
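
A minimal sketch of my own of the new setting (the tablespace name is hypothetical):

-- assumes a tablespace called fast_ssd on storage that copes well with deep queues
ALTER TABLESPACE fast_ssd SET (effective_io_concurrency = 64);
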
On 7th of September, Jeff Davis committed patch: Add log_line_prefix option 'n' for Unix epoch.   Prints time as Unix epoch with milliseconds.   Tomas Vondra, reviewed by Fabien Coelho. This could be useful, as it's simpler to calculate differences in epoch times than it is in timestamps. So, let's see. Current, I have: $ […]
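
A minimal sketch of my own of trying the new escape (%n is the escape the commit adds, %p is the usual process ID escape):

-- epoch-with-milliseconds timestamp plus backend PID in every log line
ALTER SYSTEM SET log_line_prefix = '%n [%p] ';
SELECT pg_reload_conf();
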
In my previous post I shared some GPU performance data, which was mostly related to aggregation. Given the information I found on the pgstrom wiki page, I decided to give joins a try to see how much speed we can gain by offloading some of the work PostgreSQL has to do to the […]
Recently, while working with various applications in docker containers, we came across a few containers that will not run properly unless privileged mode is enabled. Privileged mode gives the container the same rights as the host, which means it can make changes on the host where the container runs. (A huge difference compared to a VM - imagine your VM making changes to the hypervisor directly.)

Of course privileged mode has its uses, and I am definitely glad that it is available. However, it is not a general-purpose option to be used lightly. So imagine my surprise that one of the most common tools used in many enterprises these days, Chef server, also required privileged mode when running in a docker container. There are various versions available, but they all required the mode.

While investigating Chef server to see why it requires the mode, I found it primarily needs it to set some ulimit parameters and a specific kernel parameter inside the container.

sysctl -w kernel.shmmax=17179869184

Now before you say "aha, simple, let's change the value on the host itself and let the container pick it up from there", let me say: been there, it ain't gonna work. The reason it does not work is due to how Linux namespaces work with CLONE_NEWIPC. The net result is that every time a container is created, a new System V IPC namespace is set up with the default shmmax of 32MB. The default will be changed to 4GB in a later Linux kernel, but of course, like most companies, there will not be the patience to wait for that kernel to show up, let alone a certified Linux distro for production setups.

There are a few hacks to work around it, as Jerome indicates in a mailing list, but of course none of them was suitable.

Now let's go back to the original command that needed to be executed. I have worked with such commands for years, always to increase shared memory for databases that use the Sys V style of shared memory, like Oracle, PostgreSQL (well, till 9.2), etc.

Guess what doing a little digging I di

[continue reading]

PostgreSQL’s TABLESAMPLE brings a few advantages compared to the traditional ways of getting random tuples.

TABLESAMPLE is a SQL SELECT clause and it provides two sampling methods which are SYSTEM and BERNOULLI. With the help of TABLESAMPLE we can easily retrieve random rows from a table. For further reading about TABLESAMPLE you can check the previous blog post.
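
For quick reference, here is a minimal example of my own of the two methods (using the ts_test table created later in this post):

-- page-level sampling: very fast, but returns whole blocks of rows
SELECT * FROM ts_test TABLESAMPLE SYSTEM (1);

-- row-level sampling: scans more, but gives a more uniform ~1% sample
SELECT * FROM ts_test TABLESAMPLE BERNOULLI (1);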

In this blog post we’ll talk about alternative ways of getting random rows: how people selected random rows before TABLESAMPLE, what the pros and cons of the other methods are, and what we gained with TABLESAMPLE.

There are awesome blog posts about selecting random rows, you can start reading the following blog posts to gain a deep understanding of this topic.

My Thoughts of Getting Random Row

Getting Random Rows from a Database Table


Let’s compare the traditional ways of getting random rows from a table with the new ways provided by TABLESAMPLE.

Before the TABLESAMPLE clause, there were 3 commonly used methods for randomly selecting rows from a table.

1- Order by random()

For testing purposes we need to create a table and put some data inside of it.

Let’s create ts_test table and insert 1M rows into it:

CREATE TABLE ts_test (
  title TEXT
);

INSERT INTO ts_test (title)
SELECT
    'Record #' || i
FROM
    generate_series(1, 1000000) i;

Consider the following SQL statement for selecting 10 random rows:

SELECT * FROM ts_test ORDER BY random() LIMIT 10;

It causes PostgreSQL to perform a full table scan and also a sort. Therefore this method is not preferred for tables with a large number of rows, for performance reasons.

Let’s look at the EXPLAIN ANALYZE output of the query above:

random=# EXPLAIN ANALYZE SELECT * FROM ts_test ORDER BY random() LIMIT 10;
 Limit (cost=33959.03..33959.05 rows=10 width=36) (actual time=1956.786..1956.807 rows=10 loops=1)

[continue reading]

After installing PostgreSQL 9.4 and PostGIS following An Almost Idiot's guide to installing PostgreSQL, PostGIS, and pgRouting on my CentOS 6.7 64-bit box (replacing the 9.3 references with their 9.4 equivalents), I then proceeded to install ogr_fdw. To my disappointment, there are no binaries yet for that, which is not surprising considering there aren't generally any binaries for any OS, except the windows ones I built, which I will be packaging with the PostGIS 2.2 windows bundle. Getting out of my windows comfort zone, I proceeded to build it on CentOS, mainly because I have a client on CentOS where I think ogr_fdw is a perfect fit for his workflow, and I wanted to see how difficult a feat this would be. I'll go over the steps I used for building and the stumbling blocks I ran into, in the hope it will be of benefit to those who find themselves in a similar situation.

Continue reading "Compiling and installing ogr_fdw on CentOS after Yum Install PostgreSQL / PostGIS"
Posted by Dave Cramer in credativ on 2015-09-17 at 21:41:00
Mostly some small bug fixes that didn't get into 1202.

  • fix: Implemented getFunctions
  • fix: changed getProcedureColumns to getFunctionColumns
  • fix: CopyManager fails to copy a file, reading just part of the data #366
  • add: Added PGTime/PGTimestamp
  • fix: setObject(int parameterIndex, Object x, int targetSqlType) as it will set scale of BigDecimal 'x' to 0 as default, resulting in rounded whole values (!). PR #353 (24312c6)
  • fix: round to correct amount test: add test for BigDecimal rounding behaviour in setObject(index,Object,targetSqlType) and setObject(index,Object,targetSqlType,scale) PR #353 (ff14f62)

Posted by Marco Slot in CitusData on 2015-09-17 at 21:28:22

Docker is quickly becoming one of the most popular ways of deploying distributed applications. By bundling all dependencies of an application in an easily shippable container, software deployment becomes a process that can be performed quickly and often. The open source sharding extension for PostgreSQL, pg_shard, and scalable real-time analytics solution for PostgreSQL, CitusDB, are both meant to run on a cluster of PostgreSQL servers. Deploying such a cluster can become a lot easier with Docker.

While there is no officially supported Docker container for Citus Data extensions at the moment, we were very excited to learn that Heap has published citus-docker. Heap uses CitusDB to analyze their click stream data in real-time with some advanced funnel queries. Their Docker image comes with both pg_shard and CitusDB pre-installed. If you haven't used docker before, you can follow the installation guide to set it up.

One of the benefits of Docker is that it lets you set up a whole cluster on your machine for testing very easily. pg_shard users often set up a cluster on their desktop, running multiple postgres servers on different ports. This approach is not very practical since it requires you to go through all the configuration steps multiple times. With docker-compose, setting up a local cluster becomes a breeze.

If you haven't done so already, you can install docker-compose with the following command:

sudo su
curl -L`uname -s`-`uname -m` > /usr/bin/docker-compose
chmod +x /usr/bin/docker-compose

Now run the following:

git clone
cd citus-docker
docker-compose up -d

That's it! No additional configuration required. You now have a local pg_shard cluster with 2 worker nodes and a master node, to which you can connect using: psql -h localhost -U postgres. When you are done, you can remove it using:

docker-compose kill
docker-compose rm

You could also start individual nodes by running the docker comm

[continue reading]