The world's most advanced open source database
Top posters
Number of posts in the past month
Top teams
Number of posts in the past month
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
  • Get in touch with the Planet PostgreSQL administrators at planet at
Posted by Josh Berkus in pgExperts on 2014-10-21 at 21:18:00
One of the things I mentioned in my series on VACUUM FREEZE was that we really needed a Postgres utility which would opportunistically freeze tables during low traffic periods. Today I'm announcing the Flexible Freeze project, our first attempt at designing such a utility.

All that's there right now is a simple Python script.  However, that script is already a useful tool, installed at multiple production sites. Here's how the script works:
  1. identify your active databases and daily/weekly low traffic periods.
  2. create a cron job which calls with a time limit to keep it inside your low traffic window.
  3. will loop through your tables with the oldest XIDs, freezing them until it runs out of time or out of tables
There is also a second mode, using the --vacuum switch, which does VACUUM ANALYZE on the tables with the most dead rows (according to pg_stat_user_tables).  This is to help users who have a strong high/low traffic cycle and want to make sure that regular vacuuming takes place during low traffic.  If you're running both modes, we advise doing the freeze first.

Of course, I have a tanker-truck full of desired improvements/expansions to this.  So, pull requests welcome. 

If you're more into Ruby, Wanelo has rewritten flexible freeze for Rails and incorporated it into their Postmodern tool.

I’m seeing reports of a number of issues with PostgreSQL after upgrades of OS X machines to Yosemite (OS X 10.10) that I’m concerned about, so I’m seeking more information about the experiences of PostgreSQL users who’ve done OS X 10.10 upgrades.

I can’t confirm anything yet, but back up all your databases before any upgrade to OS X 10.10. Just in case. (Of course, you do that before any upgrade, but just in case it slipped your mind this time…).

I don’t have access to a Mac because Apple’s policy prevents developers from running OS X for testing and development (or anything else) without buying physical Apple hardware and finding somewhere to run it. So I can’t test most of this myself, and I really need reports from users, or if possible, results of proactive testing by OS X users.

OS X built-in PostgreSQL deleted on update

Some OS X users appear to use the PostgreSQL version built-in to OS X for their own data, rather than installing a new PostgreSQL. Some of them, in addition to using the binaries, also use a PostgreSQL cluster (database instance) that’s created by OS X for the use of, instead of initdbing their own.

On releases prior to Yosemite the PostgreSQL provided by Apple was on the default PATH, though not necessarily running by default. It seems that on Yosemite it’s been removed; there’s no longer any /usr/bin/psql, etc. As far as I can tell now bundles PostgreSQL within the application bundle instead.

Some user reports suggest that on upgrade, the Apple-controlled databases in the new release are migrated into the new cluster managed by then the old cluster is stopped or possibly deleted – a colleage checked the upgrade script and found rm -rf /var/pgsql in it.

The PostgreSQL data directory in prior releases was /private/var/pgsql (and /var is a symlink to /private/var) or /Library/Server/PostgreSQL/Data.

The main symptom you’ll see is:

Connection refused
Is the server running locally and accepting
connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432

[continue reading]

Posted by robert berry on 2014-10-21 at 00:00:00

Monitoring Postgresql with a Background Worker

Oct 21, 2014 – Portland

pgantenna and pgsampler comprise an experimental Postgreqsql monitoring framework. This post explores how they work, and what problems they aim to solve.

Framework Overview

pgsampler is a background worker which collects data in a Postgresql cluster. It can log this data to CSV files or ship the metrics to a pgantenna instance over a tcp connection.

pgantenna is an application shipped as a Docker image which receives pgsampler data. It provides a web interface for live monitoring, checks for alerting conditions, and allows for psql access to a historical database of cluster metrics.


There are a number of high quality monitoring and performance analysis tools for Postgresql. Many of these involve a remote service which connects to Postgresql as a regular client, or an application that parses log files.

The presented framework uses a background worker to ship statistics to a remote service. It aims to solve a grab bag of real or imagined problems discussed below. Of course, this approach presents it’s own problems and is thus best characterized as an experiment.

Live Monitoring

Data is sent from the cluster in a polling loop at second intervals. Different metrics can be tuned to desired sampling rates.

Using Postgres to Monitor Postgres

Dashboard plots and alert conditions are all written directly in SQL. For example, alert conditions are triggered whenever a cron-executed query returns a NULL in the first field in the first record. Plots are rendered with plotpg.

Historical Analysis with SQL

pgantenna provides a containerized remote cluster which stores historical data separate from transactional systems. The history is just a Postgresql database that can be queried with familiar tools.

Easy to Configure

The background worker uses direct access to identify and connect to databases automatically. Security concerns notwithstanding, this allows for very little configuration minutae to get started with comprehensive monitoring.

Close t

[continue reading]

Yesterday I had an interesting discussion on irc. A guy wanted to know why Perl script is causing problems when dealing with Pg and unicode characters. The discussion went sideways, I got (a bit) upset, and had to leave anyway, so I didn't finish it. But it did bother me, as for me the reasons […]
Following feature is less important for performance, but for somebody can be important for aesthetic reasons - now you can use a styles for unicode table borders. Possible styles are only two, but you can set a border, header and column style. It is a 6 combinations. Next you have a 3 styles for borders generally - so it together 18 possible combinations of psql table output:
postgres=# \pset unicode_header_linestyle double 
Unicode border linestyle is "double".
postgres=# \pset linestyle unicode
Line style is unicode.
postgres=# \l
List of databases
Name │ Owner │ Encoding │ Collate │ Ctype │ Access privileges
postgres │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │
template0 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │ =c/postgres ↵
│ │ │ │ │ postgres=CTc/postgres
template1 │ postgres │ UTF8 │ en_US.UTF-8 │ en_US.UTF-8 │ =c/postgres ↵
│ │ │ │ │ postgres=CTc/postgres
(3 rows)

postgres=# \l
List of databases
Name Owner Encoding Collate Ctype Access privileges
═════════ ════════ ════════ ═══════════ ═══════════ ═════════════════════
postgres postgres UTF8 en_US.UTF-8 en_US.UTF-8
template0 postgres UTF8 en_US.UTF-8 en_US.UTF-8 =c/postgres ↵
template1 postgres UTF8 en_US.UTF-8 en_US.UTF-8 =c/postgres ↵
(3 rows)

postgres=# \pset border 2
Border style is 2.
postgres=# \l
List of databases
│ Name │ Owner │ Encoding │ Collate │ Ctype │ Access privileges │

[continue reading]

The POODLE attack on https (the attack is about https, the vulnerability in SSL, an important distinction) has received a lot of media attention lately, so I figured a (very) short writeup was necessary.

The TL;DR; version is, you don't have to worry about POODLE for your PostgreSQL connections when using SSL.

The slightly longer version can be summarized by:

  • The PostgreSQL libpq client in all supported versions will only connect with TLSv1 and newer, which is not vulnerable.
  • The PostgreSQL server prior to the upcoming 9.4 version will however respond in SSLv3 (which is the vulnerable version) if the client insists on it (which a third party client can do).
  • To exploit POODLE, you need a client that explicitly does out-of-protocol downgrading. Something that web browsers do all the time, but very few other clients do. No known PostgreSQL client library does.
  • To exploit POODLE, the attacker needs to be able to modify the contents of the encrypted stream - it cannot be passively broken into. This can of course happen if the attacker can control parameters to a SQL query for example, but the control over the data tends to be low, and the attacker needs to already control the client. In the https attack, this is typically done through injecting javascript.
  • To exploit POODLE, there needs to be some persistent secret data at a fixed offset in each connection. This is extremely unlikely in PostgreSQL, as the protocol itself has no such data. There is a "cancel key" at the same location in each stream, but it is not reused and a new one is created for each connection. This is where the https attack typically uses the session cookie which is both secret and fixed location in the request header.

For a really good writeup on the problem, see this post from PolarSSL, or this one from GnuTLS.

Introduced in PostgreSQL 9.4, pg_recvlogical has the ability to control the creation of logical replication slots from which logical changes can be streamed. Note that in the case this is a mandatory condition when using logical decoding. pg_receivexlog does not have in 9.4 any control on the physical replication slots it may stream from (to ensure that the WAL segment files this utility is looking for are still retained on the server side). This feature has been added for 9.5 with the following commit:

commit: d9f38c7a555dd5a6b81100c6d1e4aa68342d8771
author: Andres Freund <>
date: Mon, 6 Oct 2014 12:51:37 +0200
Add support for managing physical replication slots to pg_receivexlog.

pg_receivexlog already has the capability to use a replication slot to
reserve WAL on the upstream node. But the used slot currently has to
be created via SQL.

To allow using slots directly, without involving SQL, add
--create-slot and --drop-slot actions, analogous to the logical slot
manipulation support in pg_recvlogical.

Author: Michael Paquier

This simply introduces two new options allowing to create or drop a physical replication slot, respectively --create-slot and --drop-slot. The main difference with pg_recvlogical is that those additional actions are optional (not --start option introduced as well for backward-compatibility). Be careful of a couple of things when using this feature though. First, when a slot is created, stream of the segment files begins immediately.

$ pg_receivexlog --create-slot --slot physical_slot -v -D ~/xlog_data/
pg_receivexlog: creating replication slot "physical_slot"
pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)

The slot created can then be found in the system view pg_replication_slots.

=# select slot_name, plugin, restart_lsn from pg_replication_slots ;
   slot_name   | plugin | restart_lsn
 physical_slot | null   | 0/1000000
(1 row)

Then, when dropping a slot, as process can stream nothing it exits immedia

[continue reading]

After being on the road to do PostgreSQL consulting for Cybertec for over a decade I noticed that there are a couple of ways to kill indexing entirely. One of the most favored ways is to apply functions or expressions on the column people want to filter on. It is a sure way to kill […]
When working with PostgreSQL you generally want to get information about slow queries. The usual approach is to set log_min_duration_statement to some low(ish) value, run your app, and then analyze logs. But you can log to many places – flat file, flat file on another disk, local syslog, remote syslog. And – perhaps, instead of […]
Posted by Josh Berkus in pgExperts on 2014-10-11 at 00:19:00
To accompany the New Index Bloat Query, I've written a New Table Bloat Query.  This also involves the launch of the pgx_scripts project on GitHub, which will include most of the "useful scripts" I talk about here, as well as some scripts from my co-workers.

The new table bloat query is different from the version in several ways:
  • Rewritten to use WITH statements for better maintainability and clarity
  • Conditional logic for old Postgres versions and 32-bit platforms taken out
  • Index bloat removed, since we have a separate query for that
  • Columns formatted to be more immediately comprehensible
In the course of building this, I found two fundamentally hard issues:
  1. Some attributes (such as JSON and polygon fields) have no stats, so those tables can't be estimated.
  2. There's no good way to estimate bloat for compressed (TOAST) attributes and rows.
Also, while I rewrote the query almost entirely, I am still relying on Greg's core math for estimating table size.  Comparing this with the results of pgstattuple, I'm seeing an error of +/- 20%, which is pretty substantial.  I'm not clear on where that error is coming from, so help improving the math is very welcome!

Results look like this:

  databasename | schemaname |   tablename   | pct_bloat | mb_bloat | table_mb   
members_2014 | public | current_member | 92 | 16.98 | 18.547
members_2014 | public | member_response | 87 | 17.46 | 20.000
members_2014 | public | archive_member | 84 | 35.16 | 41.734
members_2014 | public | survey | 57 | 28.59 | 50.188

pct_bloat is how much of the table (0 to 100) is estimated to be dead space.  MB_bloat is how many megabytes of bloat are estimated to exist.  Table_mb is the actual size of the table in megabytes.

The suggested criteria is to list tables which are either more than 50% bloat and bigger than 10MB, or more than 25% bloat and bigger than 1GB.  However, you should calibrate

[continue reading]

On 7th of October, Alvaro Herrera committed patch: Implement SKIP LOCKED for row-level locks   This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the […]

SKIP LOCKED is a new feature associated with row-level locking that has been newly-introduced in PostgreSQL 9.5 by this commit:

commit: df630b0dd5ea2de52972d456f5978a012436115e
author: Alvaro Herrera <>
date: Tue, 7 Oct 2014 17:23:34 -0300
Implement SKIP LOCKED for row-level locks

This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro

Let's take for example the simple case of the following table that will be locked:

=# CREATE TABLE locked_table AS SELECT generate_series(1, 4) as id;

Now a session is taking a shared lock on the row created of locked_table, taking the lock within a transaction block ensures that it will still be taken for the duration of the tests.

=# SELECT id FROM locked_table WHERE id = 1 FOR SHARE;
(1 row)

Now, the shared lock prevents any update, delete or even exclusive lock from being taken in parallel. Hence the following query will wait until the transaction of previous session finishes. In this case this query is cancel by the user (note that error message tells for which row this query was waiting for):

=# SELECT * FROM locked_table WHERE id = 1 FOR UPDATE;
^CCancel request sent
ERROR:  57014: canceling statement due to user request
CONTEXT:  while locking tuple (0,1) in relation "locked_table"
LOCATION:  ProcessInterrupts, postgres.c:2966

There is already one way to bypass this w

[continue reading]

Posted by Mark Wong in 2ndQuadrant on 2014-10-10 at 00:35:01

This another simple test in continuation from last time. We will start with the same lineitem table as in the previous example. We will measure the time it takes to load the same 7.2GB text file repeatedly until the table size grows to about 1TB. We create a baseline with a table that has no indexes built on it. Then repeat with a B-tree index on the l_shipdate DATE column, and again after replacing the B-tree index with a BRIN index.


Our baseline shows that as the table grows the time it takes to insert data also increases. The difference in the time that it takes to insert data when the table is near 1TB compared to when it is empty is about 12 seconds. With the B-tree index in place the difference increases to 84 seconds. Finally the change is only about 15 seconds with the BRIN index in place.

So over a 1TB growth, the overheard on inserting data into the lineitem table due to just the size of the table increases about 4.3%. B-trees increase that difference to 12.2%. While the BRIN index continues to look encouraging by only increasing the overhead to 4.2%.

The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n°318633 – the AXLE project –

Posted by Greg Sabino Mullane in EndPoint on 2014-10-09 at 15:44:00

Manny Calavera (animated by Lua!)
Image by Kitt Walker

Someone on the #postgresql IRC channel was asking how to make a copy of a schema; presented here are a few solutions and some wrinkles I found along the way. The goal is to create a new schema based on an existing one, in which everything is an exact copy. For all of the examples, 'alpha' is the existing, data-filled schema, and 'beta' is the newly created one. It should be noted that creating a copy of an entire database (with all of its schemas) is very easy: CREATE DATABASE betadb TEMPLATE alphadb;

The first approach for copying a schema is the "clone_schema" plpgsql function written by Emanuel Calvo. Go check it out, it's short. Basically, it gets a list of tables from the information_schema and then runs CREATE TABLE statements of the format CREATE TABLE (LIKE INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS). This is a pretty good approach, but it does leave out many types of objects, such as functions, domains, FDWs, etc. as well as having a minor sequence problem. It's also slow to copy the data, as it creates all of the indexes before populating the table via INSERT.

My preferred approach for things like this is to use the venerable pg_dump program, as it is in the PostgreSQL 'core' and its purpose in life is to smartly interrogate the system catalogs to produce DDL commands. Yes, parsing the output of pg_dump can get a little hairy, but that's always preferred to trying to create DDL yourself by parsing system catalogs. My quick solution follows.

pg_dump -n alpha | sed '1,/with_oids/ {s/ alpha/ beta/}' | psql

Sure, it's a bit of a hack in that it expects a specific string ("with_oids") to exist at the top of the dump file, but it is quick to write and fast to run; pg_dump creates the tables, copies the data over, and then adds in indexes, triggers, and constraints. (For an explanation of the sed portion, visit this post). So this solution works very well. Or does it? When playing with this, I found that there is on

[continue reading]

As a followup to my previous blog post, A Large Database Does Not Mean Large shared_buffers, I had some more interesting findings applying the queries in that blog post to another client recently. I assume you have read that one already and don’t repeat any of what I explained previously, so if you haven’t read that one and aren’t familiar with the pg_buffercache extension, I highly recommend you go read that one first.

Another mantra often heard in PostgreSQL circles that you usually don’t want to set shared_buffers higher than 8GB. I will admit, that for a majority of clients, that is great advice and a good starting point (and a whole lot more useful than the default 32MB). There are also issues around double-buffering and allowing the kernel to do what it can probably do better than PostgreSQL as far as managing page reads/writes (a topic way out of the scope of this blog post). But if you investigate further into how PostgreSQL is using its shared memory and what your high demand data blocks actually are, you can possibly find benefit in setting it higher. Especially when you can clearly see what PostgreSQL thinks it needs most often. Or if you can just fit the whole thing into memory, as I stated before.

The client in these examples has shared_buffers set to 24Gb and the total database size is 145GB (111GB in the primary followed by 28GB, 5GB,  270MB & 150MB). I say small in the title of this post, but both large and small are relative terms and for my typical work this is a small database. And a setting that is 17% of the total size is larger than normal, so along with being a catchy followup name, the results do fit the title.

So I ran the basic query at the end of my previous post to see what the “ideal” minimal is. I ran this several times over about a half-hour period and, unlike the databases in my previous post, it did not deviate much.

database=# SELECT pg_size_pretty(count(*) * 8192) as ideal_shared_buffers
FROM pg_buffercache b
WHERE usagecount >= 3;

[continue reading]

Posted by Jim Mlodgenski in OpenSCG on 2014-10-07 at 17:25:44

I recently attended a excellent meetup about Redshift and one of the comments by the presenter was the trouble of the running of the UNLOAD command. The trouble they were having was that the UNLOAD command takes an SQL statement as a parameter, but if that SQL statement has strings, you need to escape everything which makes it fairly unreadable.

We can see an example of this in PostgreSQL using the dblink extension:

  FROM dblink('dbname=postgres', 'SELECT * FROM test WHERE b = ''2014-02-02''')
    AS t(a int, b date);

Since Redshift is a derivative of PostgreSQL, the dollar quoting syntax also works. Dollar quoting is a non-standard way of denoting string constants, but it makes things much simpler to read.

  FROM dblink('dbname=postgres', $$ SELECT * FROM test WHERE b = '2014-02-02' $$)
    AS t(a int, b date);
Posted by Joshua Drake in CommandPrompt on 2014-10-07 at 17:11:03
As a PostgreSQL consultant you end up working with a lot of different types of clients and these clients tend to all have different requirements. One client may need high-availability, while another needs a DBA, while yet another is in desperate need of being hit with a clue stick and while it is true that there can be difficult clients, there is no bad client.
What!!! Surely you can't be serious?

Don't call me shirley.

I am absolutely serious.

A bad client is only a reflection of a consultants inability to manage that client. It is true that there are difficult clients. They set unrealistic expectations, try to low ball you by with things like: "We can get your expertise for 30.00/hr from India" or my favorite: calling you directly when it is after hours to "chat".

How are these not bad clients? They are not bad clients because it is you that controls the relationship with the client. You as the consultant have to set proper boundaries with the client to insure that the relationship as a whole is positive and profitable. If you can't manage that relationship you have two choices:

  1. Hire someone who can
  2. Fire the client

Woah! Fire the client? Yes. Terminate the relationship with the client.

It is always amazing to me how many people can't fathom the idea of firing a client. It is always some sacred vow that a client can fire you but you are left holding the bag, somehow that bag is filled with the feces of some dog and you are expected to light it on fire and leave it on the porch of some unsuspecting high-school football coach.[1]

The counter argument to this is usually "I need the money". This is a valid argument but do you need the money so badly that you are willing to sacrifice your health or your relationships? It is astonishing how many consultants are willing to do exactly that. In the words of the legendary band Big Fun, "Suicide, don't do it"[2].

The better you manage a client, the better the relationship. Good luck!


[continue reading]

Posted by gabrielle roth on 2014-10-07 at 04:55:38
I’ve spent the past few weeks learning my way around Amazon’s RDS offering (specifically Postgres, and a bit of elasticache). It’s a mixed bag so far; for every feature I think “Hey, this is neat!” I find at least one or two others that are not so thrilling. One of the things that may annoy […]
Posted by Andrew Dunstan in pgExperts on 2014-10-06 at 21:06:00
pg_repack is a terrific tool for allowing you to reorganize a table without needing to hold long running strong locks on the table. That means that that your normal inserts, updates and deletes can continue to run against the table while the reorganization is proceeding.

I have had clients who have run into problems with it, however. In particular, it is possible to get it wedged so that the table is inaccessible and nothing can proceed, unless you either kill the repack operation or kill what is blocking it. Here is a simple example of how to cause problems.

In session 1, do:
pg_reorg -e -t foo dbnameset

and in session 2 in psql do:
select pg_sleep(10); lock table foo; rollback;
The sleep gets us past the time when pg_reorg is setting up, and happens while it is is doing its CREATE TABLE ... AS SELECT .... When that CREATE TABLE statement finishes, both sessions will be wedged.  Session 2 will be hung because it is unable to lock the table, since pg_reorg's other session will hold a weak lock on the table. And nothing, including pg_reorg, will be able to do anything with the table.

The solution is to make sure that nothing holds or even tries to obtain any strong long running locks on the table.

One useful thing is to use the monitor script to look for things like long running transactions and processes waiting for locks.

Or you can create a more customized test to look for this exact situation.

Most importantly, you need to be aware that problems can occur, and to protect against them happening in the first place.

Posted by Kirk Roybal on 2014-10-06 at 15:53:52

Our topic for November, “If you know PostgreSQL, then you know Big Data”. The HUGEdata Tech Team and Principal Data Scientist will share an overview of our Scale out SQL database that leverages PG admin client access. We will provide a demonstration, that includes a marketing example for customer segmentation. We’ll also talk about machine data, the internet of things, and other use case for Big Data. And, we’d love interaction with the group on the challenges they’ve faced scaling Postgres and their ideas on how to position our Analytics Platform to the community.

What: HugeData and PostgreSQL

Who: Beth Lahaie and the HugeData team

When: Wednesday, November 5, 2014

Improving Enterprises
16633 Dallas Parkway Suite 110 Addison, TX 75001

DFW PUG on Meetup

I have been looking at the new Block Range Indexes (BRIN) being developed for PostgreSQL 9.5. BRIN indexes are designed to provide similar benefits to partitioning, especially for large tables, just without the need to declare partitions. That sounds pretty good but let’s look in greater detail to see if it lives up to the hype.

How large? Here’s one data point. Using the TPC Benchmark(TM) H provided dbgen we created data for the lineitem table at the 10GB scale factor, which results in a 7.2GB text file.

We’re going to compare a couple of basic tasks. The first look will be at the impact of inserting data into a table using the COPY command. We will do a simple experiment of creating a table without any indexes or constraints on it and time how long it takes to load the lineitem data. Then repeat with a B-tree index on one column. And finally repeat again with a BRIN index instead of a B-tree index on the same column.

axle-load-testThe above bar plot shows the average times over five measurements. Our baseline of loading the lineitem table without any indexes averaged 5.1 minutes. Once a B-tree index was added to the i_shipdate DATE column, the average load time increased to 9.4 minutes, or by 85%. When the B-three index was replaced by a BRIN index, the load time only increased to 5.6 minutes, or by 11%.

The next experiment is to average how long it takes to create a B-tree index on a table that is already populated with data. Then repeat that with a BRIN index. This will be done on the same i_shipdate DATE column and repeated for a total of five measurements each.




The B-tree index took 95 seconds to build, where the BRIN index 18 seconds to build, an 80% improvement.

That’s very encouraging. The overhead to loading data into a table from a single BRIN index is only 11%, and reduced the total load time by 40% when compared to having a B-tree index. And creating a new BRIN index takes only 20% of the time that a new B-tree index would take. We will have more experiments lined up to see where else BRIN indexes may or ma

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2014-10-03 at 17:37:00
For years I have been wanting to set up a farm of machines, modelled after the buildfarm, that will run some benchmarks and let us see performance regressions. Today I'm publishing some progress on that front, namely a recipe for vagrant to set up an instance on AWS of the client I have been testing with. All this can be seen on the PostgreSQL Buildfarm Github Repository on a repo called aws-vagrant-benchfarm-client. The README explains how to set it up. The only requirement is that you have vagrant installed and the vagrant-aws provider set up (and, of course, an Amazon AWS account to use).

Of course, we don't want to run members of the benchfarm on smallish AWS instances. But this gives me (and you, if you want to play along) something to work on, and the provisioning script documents all the setup steps rather than relying on complex instructions.

The provisioner installs a bleeding edge version of the buildfarm client's experimental Pgbench module, which currently only exists on the "benchfarm" topic branch. This module essentially runs Greg Smith's pgbench-tools suite, gets the results from the results database's "tests" table, and bundles it as a CSV for upload to the server.

Currently the server does nothing with it. This will just look like another buildfarm step. So the next thing to do is to get the server to start producing some pretty and useful graphs. Also, we need to decide what else we might want to capture.

Row-level security is a new feature of PostgreSQL 9.5 that has been introduced by this commit:

commit: 491c029dbc4206779cf659aa0ff986af7831d2ff
author: Stephen Frost <>
date: Fri, 19 Sep 2014 11:18:35 -0400
Row-Level Security Policies (RLS)

Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table.  Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.

Behind this jargon is a feature that could be defined in short words as a complementary permission manager of GRANT and REVOKE that allows controlling at row level which tuples can be retrieved for a read query or manipulated using INSERT, UPDATE or DELETE. This row control mechanism is controlled using a new query called CREATE POLICY (of course its flavor ALTER POLICY to update an existing policy and DROP POLICY to remove a policy exist as well). By default, tables have no restrictions in terms of how rows can be added and manipulated. However they can be made able to accept level restriction policies using ALTER TABLE and ENABLE ROW LEVEL SECURITY. Now, let's imagine the following table where a list of employees and their respective salaries can be read (salary is an integer as this is entirely fictive situation and refers to no real situation, quoique...):

=# CREATE TABLE employee_data (id int,
       employee text,
       salary int,
       phone_number text);
=# CREATE ROLE jeanne;
=# INSERT INTO employee_data VALUES (1, 'ceo', 300000, '080-7777-8888');
=# INSERT INTO employee_data VALUES (2, 'jeanne', 1000, '090-1111-2222');
=# INSERT INTO employee_data VALUES (3, 'bob', 30000, '090-2222-3333');

Now let's set

[continue reading]

Posted by gabrielle roth on 2014-10-03 at 01:27:00

When: 6-8pm Thu Oct 16, 2014
Where: Iovation
What: PgOpen Recap (gabrielle, Mark, John M);  New Relic Instrumentation of Pg Queries (Andrew)

Please note the new earlier meeting time! We’ll try this over the winter.

Two topics this month:  PgOpen attendees will discuss highlights of that conference, and Andrew will talk about some New Relic-y stuff.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak).  It’s right on the Green & Yellow Max lines.  Underground bike parking is available in the parking garage;  outdoors all around the block in the usual spots.  No bikes in the office, sorry!

Elevators open at 5:45 and building security closes access to the floor at 6:30.

Posted by Josh Berkus in pgExperts on 2014-10-02 at 23:30:00
If you've been paying any attention at all, you're probably wondering why 9.4 isn't out yet.  The answer is that we had to change JSONB at the last minute, in a way that breaks compatibility with earlier betas.

In August, a beta-testing user reported that we had an issue with JSONB not compressing well.  This was because of the binary structure of key offsets at the beginning of the JSONB value, and the affects were dramatic; in worst cases, JSONB values were 150% larger than comparable JSON values.   We spent August through September revising the data structure and Heikki and Tom eventually developed one which gives better compressibility without sacrificing extraction speed.

I did a few benchmarks on the various JSONB types.  We're getting a JSONB which is both faster and smaller than competing databases, so it'll be worth the wait.

However, this means that we'll be releasing an 9.4beta3 next week, whose JSONB type will be incompatible with prior betas; you'll have to dump and reload if you were using Beta 1 or Beta 2 and have JSONB data.  It also means a delay in final release of 9.4.
On 19th of September, Stephen Frost committed patch: Row-Level Security Policies (RLS)   Building on the updatable security-barrier views work, add the ability to define policies on tables to limit the set of rows which are returned from a query and which are allowed to be added to a table. Expressions defined by the policy […]
Posted by Craig Kerstiens on 2014-10-01 at 07:00:00

Most web applications will add/remove columns over time. This is extremely common early on and even mature applications will continue modifying their schemas with new columns. An all too common pitfall when adding new columns is setting a not null constraint in Postgres.

Not null constraints

What happens when you have a not null constraint on a table is it will re-write the entire table. Under the cover Postgres is really just an append only log. So when you update or delete data it’s really just writing new data. This means when you add a column with a new value it has to write a new record. If you do this requiring columns to not be null then you’re re-writing your entire table.

Where this becomes problematic for larger applications is it will hold a lock preventing you from writing new data during this time.

A better way

Of course you may want to not allow nulls and you may want to set a default value, the problem simply comes when you try to do this all at once. The safest approach at least in terms of uptime for your table –> data –> appliction is to break apart these steps.

  1. Start by simply adding the column with allowing nulls but setting a default value
  2. Run a background job that will go and retroactively update the new column to your default value
  3. Add your not null constraint.

Yes it’s a few extra steps, but I can say from having walked through this with a number of developers and their apps it makes for a much smoother process for making changes to your apps.

Posted by Payal Singh in OmniTI on 2014-09-30 at 20:12:00
A while ago a got a task to change the owner of a group of functions. While the number of functions wasn't too high, it was still enough that I began looking at ways to change the owner in a batch, instead of having to manually change it for each function.
In case of other database objects, changing owners is fairly simple. It can be accomplished in two steps:

1. Get list of all tables/sequences/views:

payal@testvagrant:~$ psql -qAt -c "SELECT 'ALTER TABLE '||schemaname||'.'||tablename||' OWNER TO new_owner;' FROM pg_tables WHERE schemaname = 'payal'" > test.txt

This will give us the following file:

payal@testvagrant:~$ cat test.txt
ALTER TABLE payal.new_audit_users OWNER TO new_owner;
ALTER TABLE payal.v_count_states OWNER TO new_owner;
ALTER TABLE payal.test OWNER TO new_owner;
ALTER TABLE payal.old_audit_users OWNER TO new_owner;
ALTER TABLE payal.old_audit OWNER TO new_owner;
ALTER TABLE payal.adwords_dump OWNER TO new_owner;
ALTER TABLE payal.affiliate OWNER TO new_owner;
ALTER TABLE payal.new_affiliate OWNER TO new_owner;
ALTER TABLE payal.partest OWNER TO new_owner;
ALTER TABLE payal.audit_test OWNER TO new_owner;
ALTER TABLE payal.batatawada OWNER TO new_owner;
ALTER TABLE payal.dup_key_err OWNER TO new_owner;
ALTER TABLE payal.new_audit OWNER TO new_owner;

2. Now all that is needed is to run this file with psql:

payal@testvagrant:~$ psql < test.txt

That simple! An alternate solution can be found here.

However, things get a little tricky with functions due to argument specifications. Basically, one needs to specify a function's arguments along with the function name to alter it. For example:

ALTER FUNCTION hstore.tconvert(text, text) OWNER TO hstore;

Using the method described above to changed owner for tables, you cannot get the function arguments from pg_proc. Instead, postgres has a function pg_get_function_identity_arguments(func_oid) t

[continue reading]

Posted by Tomas Vondra on 2014-09-30 at 15:00:00

This is the post I promised last week, explaining a few common issues with memory contexts. The issues mostly lead to excessive overhead, i.e. excessive usage of resources - usually memory. And that's exactly what this post is about, so whenever I say "overhead" you can read "excessive memory usage." I will also try to give advices on how to avoid those issues or minimize the impact.

As I briefly mentioned when explaining allocation set internals, there are three main sources of overhead:

  • chunk headers - for large chunks, this gets negligible
  • unused space (because of 2^N chunks) - expected ~25% for randomly sized chunks, but can get much worse
  • reuse not working efficiently - we'll see some examples how this can happen

If you haven't read that post, it's probably the right time to do that. Also, if you want to learn more about memory management and allocator implementations, there's a great post at IBM developerWorks explaining it quite well and also listing many interesting additional resources (e.g. various malloc implementations).

So let's see some usual (but somehow unexpected) examples of palloc overhead.

Allocating many tiny pieces of memory

Say we need to store a lot of small elements, a few bytes each - e.g. 64-bit integers, words in an ispell dictionary or something like that. You could do a separate palloc call for each element (and keep the pointer), but in that case you'll pay the 'header' price for each element. For example by doing this

int64 *items[1024];
for (i = 0; i < 1024; i++)
    items[i] = (int*)palloc(sizeof(int64));

you might think you allocated ~8kB of memory (1024 x 8B), but in fact this prepends each value with 16B header. So you end up with about 24kB (not counting the items array, which needs additional 8kB), which is ~3x the requested amount. If we asked for smaller values (e.g. 4B integers), the overhead would be even larger because 8B is the smallest chunk allocated by palloc (as mentioned in the previous post).

Let's assume all the elements have about the same life span / scop

[continue reading]

Posted by gabrielle roth on 2014-09-30 at 01:58:19
Yep, I’m headed to Madrid! I’ll be reprising my Autovacuum talk from SCALE, and am really looking forward to meeting some new folks. I’ll be helping out at the conference in some capacity, so come say hello. For reference, the conference schedule is here: Other talks I plan to attend: Wednesday: Performance Archaeology sounds […]