First some background: I’ve always had a bit of a love-hate relationship with ORMs. ORMs are great for basic CRUD applications, which every app inevitably needs at some point. The two main problems I have with ORMs are:

  1. They treat all databases as equal (yes, this is a little overgeneralized but typically true). They claim to do this for database portability, but in reality an app still can’t just up and move from one to another.
  2. They don’t handle complex queries well at all. As someone who sees SQL as a very powerful language, I find that taking away all that power just leaves me with pain.

Of course these aren’t the only issues with them, just the two I personally run into over and over.

While playing with Node I was eager to explore Massive.JS, as it seems to buck the trend of simply imitating all the other ORMs. My initial impression: it makes me want to do more with Node just for this library. After all, the power of a language is the ecosystem of libraries around it, not just the core language. So let’s take a quick tour through a few highlights of what makes it really great.

Getting set up

Without further ado, here’s a quick tour around it.

First, let’s pull down the example database from PostgresGuide.

Then let’s set up our Node app:

$ npm init
$ npm install massive --save

Connecting and querying

Now let’s try to connect and, say, query a user from within our database. Create the following as an index.js file, then run it with node index.js:

var massive = require("massive");
var connectionString = "postgres://:@localhost/example";

var db = massive.connectSync({connectionString : connectionString});

db.users.find(1, function(err, res){
  // print the error or the matching user record
  if (err) { return console.error(err); }
  console.log(res);
});

Upon first run, if you’re like me and use the PostgresGuide example database (which I now need to go back and tidy up), you’ll get the following:

db.users.find(1, function(err,res){
TypeError: Cannot read property 'find' of undefined

I can’t describe how awesome it is to see this. What’s happening is that when Massive loads up, it’s connecting to you

[continue reading]

Posted by Amit Kapila in EnterpriseDB on 2015-11-30 at 05:57:00

Parallelism is now a reality in PostgreSQL.  With 9.6, I hope we will see many
different forms of queries that can use parallelism to execute.  For now, I will
limit this discussion to what we can already do, which is Parallel Sequential Scans.

Parallel Sequential Scans are used to scan a relation in parallel with the help of
background workers, which in turn improves the performance of such scans.  I
will discuss the scenarios where users can expect a performance boost
due to this feature later in this blog, but first let us understand the basic feature
and how it works.  Three new GUC parameters have been added to tune the
usage of this feature.

max_parallel_degree - This is used to set the maximum number of workers that
can be used for an individual parallel operation.  It is quite possible that the
requested number of workers is not available at execution time.  Parallel workers
are taken from the pool of processes established by max_worker_processes, which
means that the value of max_parallel_degree should be less than max_worker_processes.
It might not be useful to set the value of this parameter higher than the number of
CPUs on your system.

parallel_tuple_cost - This is used by the planner to estimate the cost of transferring a
tuple from a parallel worker process to the master backend.  The default is 0.1.  The more
tuples that need to be passed from the worker backend processes to the
master backend process, the higher this cost becomes, and the higher the overall cost of
the parallel sequential scan plan.

parallel_setup_cost - This is used by the planner to estimate the cost of launching parallel
worker processes and setting up the dynamic shared memory they use to communicate.
The default is 1000.
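
To make this concrete, here is a minimal sketch of how these GUCs might be set in a session, assuming a 9.6-development build that includes them (the cost values shown are simply the defaults quoted above):

SET max_parallel_degree = 2;     -- allow up to two background workers per parallel operation
SET parallel_tuple_cost = 0.1;   -- planner cost of passing one tuple from a worker to the master backend
SET parallel_setup_cost = 1000;  -- planner cost of launching workers and setting up shared memory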

Now let us see a simple example to demonstrate how a parallel sequential scan works:
create table tbl_parallel_test(c1 int, c2 char(1000));
insert into tbl_parallel_test values(generate_series(1,1000000),'aaaaa');
analyze tbl_parallel_test;
explain analyze select * from tbl_parallel_test where c1 < 10000 and
c2 like '%bb%

[continue reading]

Posted by Paul Ramsey on 2015-11-29 at 00:53:00

I attended PgConf Silicon Valley a couple weeks ago and gave a new talk about aspects of PostGIS that come as a surprise to new users. Folks in Silicon Valley arrive at PostGIS with lots of technical chops, but often little experience with geospatial concepts, which can lead to fun misunderstandings. Also, PostGIS just has a lot of historical behaviours we've kept in place for backwards compatibility over the years.

Thanks to everyone who turned out to attend!

RPostgreSQL: R and PostgreSQL Database

Working with RPostgreSQL package

How do I pass a dynamic / runtime parameter to dbGetQuery in RPostgreSQL?
# use stri_paste to form a query and pass it into dbGetQuery
icd = 'A09'
require(stringi)
qry <- stri_paste("SELECT * FROM visualisation.ipd_disease_datamart WHERE icd ='", icd, "'", collapse="")
rs1 <- dbGetQuery(con, qry)
Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (cannot allocate a new connection --
maximum of 16 connections already opened)
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="DBName", host="", port=5432, user="yes", password="yes")
Close the connection when you are done. A maximum of 16 connections can be established from R to PostgreSQL; exceeding this limit throws the error above.
## list all the connections
dbListConnections(drv)

## close the connection
dbDisconnect(con)

## free all the resources on the driver
dbUnloadDriver(drv)
# OR: on.exit(dbUnloadDriver(drv), add = TRUE)

How do I close/drop all the connections in a PostgreSQL session?
We can terminate a PostgreSQL connection using the pg_terminate_backend SQL function.
In my case I had opened 16 connections using RPostgreSQL and unfortunately forgot to release them, so I ended up exceeding the maximum connection limit.
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE client_addr = '' AND pid > 20613 AND pid <> pg_backend_pid();
In the above query, pg_stat_activity returns the list of all active connections.
I terminated only the connections from the R session, i.e. those made from that (client_addr) IP.

While doing PostgreSQL consulting for a German client, I stumbled over an interesting issue this week, which might be worth sharing with some folks out on the Internet; it’s all about grouping. Suppose you are measuring the same thing several times on different sensors every, say, 15 minutes. Maybe some temperature, some air pressure or […]

The post Flexible grouping: Some dirty SQL trickery appeared first on Cybertec - The PostgreSQL Database Company.

Posted by Vasilis Ventirozos in OmniTI on 2015-11-24 at 15:59:00

It’s been almost a year since 9.4 was released. It included many nice features, but the one that changed my day-to-day administration was replication slots.

At OmniTI, we use something called OmniPITR a lot. OmniPITR could be called a WAL management suite: it can be used to ship WALs, to replay WALs, to clean up WALs when they are not needed, and it can be used for backups. The way I usually use it is to first set up streaming replication to a replica, and on top of streaming replication set up WAL-shipping replication. The reason is simple: I want to have all the WALs around in case I lose the slave, and I didn't really like the wal_keep_segments approach.

Backups are usually being taken from a slave in order to reduce load from the master.

Briefly, OmniPITR does this by sending pg_start/stop_backup to the master, putting a pause-removal file on the slave so WALs are kept, and making two tar.gz files, one for the base backup and one for the WALs.

Usually I have a backup server keeping all the WAL files and the backups, which means a second (direct or indirect) WAL-shipping destination.
All this is nice and it works really well, but with replication slots this could be simpler and more efficient.
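
For reference, a minimal sketch of the slot-based approach on 9.4 (the slot name here is made up for illustration, and this is not taken from the post):

-- on the master: create a physical replication slot so WAL is retained
-- until the replica has consumed it, instead of relying on wal_keep_segments
SELECT pg_create_physical_replication_slot('backup_slot');
-- the replica then references it in recovery.conf:
--   primary_slot_name = 'backup_slot'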

Recently, I had to setup a backup for a 1 master 2 slave setup on 9.4.

The requirements of the backup would be the following:

  • backup will be taken from the slave.
  • backup will be incremental allowing PITR
  • backup will be stored remotely on another machine.
  • minimum possible wal shipping targets.
  • replicas should be identical and adding a new replica should be trivial.
  • backup should be initiated from the server that keeps backups and not from a slave.
  • backups should be tested by restoration every week.

Pretty straight forward stuff really.

I set up archiving to the backup server with a simple rsync command:
archive_command = 'rsync -a %p postgres@<Backup server IP>:/data/walarchive/%f'

I created a backup user that will use .pgpass:
touch ~/.pgpass ; chmod 0600 ~/.pgpass

and added the following line:

<master ip>:5432:template1:backup:pgbackup

[continue reading]

Posted by Federico Campoli on 2015-11-24 at 11:11:00
Three days to go for the next Brighton PostgreSQL meetup.
I'll run a live hangout of the talk.

You can join the event there.

The recording will become available on YouTube shortly after the talk ends.

The PostgreSQL RDBMS has very informative built-in tools to track the state of the database. One such tool is pg_stat_activity, a system view that allows you to monitor database processes in real time. This view is comparable to the system command top, and it is one of the first places from which a database administrator can start an investigation when a problem occurs. In this post I explain some useful examples of how this view can be used to detect abnormal activity. To start, we just need some PostgreSQL client, psql for example.

As a first step you can use an easy query:
# SELECT * FROM pg_stat_activity;

It is not important which database you are connected to; pg_stat_activity is shared among databases.
pg_stat_activity shows a single line for each database connection, and there is a dedicated UNIX process for each connection. For each connection there are several attributes:
pid - process ID of the process which handles the connection. In PostgreSQL terminology this process is called a backend or worker.
datid, datname - ID and name of the database to which the client is connected.
usesysid, usename - ID and name of the user who connected.
client_addr, client_port, client_hostname - network settings of the client: network address, port and hostname.
application_name - an arbitrary name that can be specified when the client connects, or via a session variable.
backend_start, xact_start, query_start, state_change - timestamps which indicate when the process (backend), transaction or query within a transaction was started, and the last time the state of the process changed.
state, waiting - state of the process, and a flag which indicates whether the process is waiting for another process.
backend_xid, backend_xmin - running transaction ID (xid) and the transaction ID (xmin) which determines the visibility scope for the currently running transaction.
query - text of the query which is currently being run by the worker, or was most recently executed in this connection.
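
As a quick illustration of how these columns combine in practice, here is a minimal sketch (not from the post) that counts client connections per database and state:

SELECT datname, state, count(*) AS connections
  FROM pg_stat_activity
 GROUP BY datname, state
 ORDER BY connections DESC;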

Since each line describes a single connection, we can easily find out how many client connecti

[continue reading]

We just pushed out installers for PostGIS 2.2.0 for PostgreSQL 9.5beta2 on Windows, both 32-bit and 64-bit, on Application Stackbuilder. These installers are also available as standalone downloads listed on the PostGIS windows page. This is the first PostGIS 2.2.0 release for PostgreSQL 9.5 32-bit and a rerelease for PostgreSQL 9.5 64-bit (this time compiled against beta2 instead of beta1).

On quick testing, the PostGIS 2.2.0 release built against 9.5beta1 and pgRouting 2.1.0 worked fine on 9.5beta2; however, you may want to reinstall anyway just to be safe. You can just reinstall over your existing install, no need to uninstall first. Similarly, upgrading a PostgreSQL 9.5beta1 install to 9.5beta2 seemed to not require pg_upgrade or dump/restore, so it is safe to just upgrade from 9.5beta1 to 9.5beta2. Other notes about this 9.5beta2 PostGIS 2.2.0 release:

  • The FDW API changed between PostgreSQL 9.5beta1 and PostgreSQL 9.5beta2, so OGR_FDW, if you don't reinstall the bundle, will crash and burn in PostgreSQL 9.5beta2 (when using the PostGIS 2.2.0 executables built against beta1). Similarly, this newly compiled OGR_FDW will not work on PostgreSQL 9.5beta1 (so upgrade to 9.5beta2 first).
  • The PostgreSQL 9.5 betas (that includes both beta1 and beta2) are compiled against the pointcloud 1.1 master branch. This was required because the released pointcloud 1.0.1 does not compile against PostgreSQL 9.5.
  • The PostgreSQL 9.5beta2 PostGIS 2.2.0 release comes packaged with SFCGAL 1.2.2 (instead of 1.2.0 like the other versions), which fixes a crasher with ST_StraightSkeleton as noted in ticket - A newer SFCGAL will be packaged with the upcoming PostGIS 2.2.1, but if you are on an older edition and are using SFCGAL, you can always copy the latest SFCGAL.dll binaries from the 2.2.1dev packages on the PostGIS windows page.

I have just released version 4.16 of the PostgreSQL Buildfarm client.

It can be downloaded from

Several bugs have been fixed, and there are these non-trivial changes:

  • capturing of TAP test logs
  • bin check enabled on Windows
  • rm_worktrees feature
  • an experimental module to run Sepgsql tests
  • try database shutdown following a start failure
  • report all PGBuild module versions to server

rm_worktrees is enabled by putting this in your config file:

rm_worktrees => 1,

The effect is that at the end of a run the checked-out work tree is removed, leaving just the git repository, which in turn might be mostly linked to your HEAD branch if you use the git_use_workdirs option. Before a run, the work tree will be checked out again. The net effect is a significant saving in space used. With these two options, the additional space for each branch, except when it's actually building, is reduced to less than 30MB. On crake, the git mirror, branch trees and cache now come in at about 1.5GB. That's a lot less than it used to be. The additional cost of checking out the work tree each time is very modest.

Shutdown after a start failure tries to remedy a situation where we try to start the server, and don't detect that it has started, but it has in fact started. So now if we get a failure we try to shut down any server that might have started, just in case. This change is possibly redundant given the recent change where postgres detects that its data directory has disappeared and shuts down when it has, but it's probably worth having anyway.

Posted by Shaun M. Thomas on 2015-11-20 at 16:45:06

It has occurred to me that I may have been spending a bit too much time being excited about new Postgres features and developments in the community. One of the intents of this weekly article is education, so this week, let’s get back to basics. To that end, the topic for this week boils down to the tools available for managing Postgres instances, and how to use them. Surprisingly, it’s not as straightforward as you might think.

Having used Postgres for almost 15 years now, it’s easy to forget (or mentally block!) the early struggles. With new tools and wrappers always being developed, it’s not always clear what the best practices for managing a Postgres cluster actually are. Indeed, it often depends on how Postgres is installed.

Let’s start with a basic source-code installation. I won’t cover that process, but we can assume that after the smoke clears, the binaries are somewhere in /usr/bin or /usr/local/bin, and are thus in our execution path. Given that, let’s say we have an available mount point at /data and want to create a new cluster there. Here’s how that might go:

sudo mkdir /data/pgsql
sudo chown postgres:postgres /data/pgsql
sudo su - postgres
initdb -D /data/pgsql/my_db
pg_ctl -D /data/pgsql/my_db start

We now have an active Postgres instance at /data/pgsql/my_db. It’s extremely common to use the postgres OS user for this kind of thing, hence all of our sudo commands to prepare. It’s entirely possible to do this as a regular user, but I usually don’t recommend that approach.

In any case, this type of installation essentially depends on the pg_ctl command-line tool. It does everything related to controlling a Postgres instance. But it’s also annoying to use the -D parameter all the time when using Postgres tools, so there are several environment variables that can also do the job. This lets us prime our environment with .bashrc, for example. Let’s stop the instance:

export PGDATA=/data/pgsql/my_db
pg_ctl stop -m fast

Why the -m fast part? By default, Postgres is ex

[continue reading]

The Paxos algorithm is a powerful building block for building highly available distributed systems. Paxos can be seen as a function paxos(k,v) that returns the same value on all servers in a group for a certain key (k), and the value is one of the inputs (v). Paxos is most commonly used to implement log replication through a technique called Multi-Paxos. In Multi-Paxos, nodes call paxos using the indexes in a log as keys and state changes (e.g. 'set primary to node2') as values. Using this technique, data can be kept consistent and available across multiple servers, even if some of them fail.

We recently started exploring Paxos as a technique for doing automated fail-over of PostgreSQL servers and wrote a simple implementation of Paxos and Multi-Paxos including basic support for membership changes in about 1000 lines of PL/pgSQL. We found PL/pgSQL to be an excellent tool for implementing Paxos, because the necessary transactional semantics are a natural part of the language. Other consensus algorithms would have been harder to implement in this way, because they rely on timers and other background tasks. 

The pg_paxos extension demonstrates how the Paxos functions can be used for fault-tolerant, consistent table replication in PostgreSQL. While a replicated table has high write and read latencies due to the network round-trips involved, it can be very useful for applications such as automated fail-over. We plan to add optimisations such as the ones applied in Google Megastore to reduce the overhead and develop the extension further to be able to replace components like etcd or Zookeeper.

Posted by Tomas Vondra in 2ndQuadrant on 2015-11-19 at 14:30:00

While discussing the results of various filesystem benchmarks, both here and in talks given at conferences, one of the things I've been questioning was the negligible impact of the TRIM/DISCARD mount option.

I was speculating that maybe TRIM only really matters when the drive gets almost full, because until then the controller has enough room to write the incoming data without putting the internal garbage collection under pressure. But I recently did a few pgbench runs aiming to test exactly this, using a rather large scale (filling more than 90GB of the 100GB drive), yet still saw no difference.

Another speculation was that maybe this particular SSD is really good and there's so much additional overprovisioning that TRIM still makes no difference. After all, it's a good SSD from Intel (S3700) meant for write-intensive data center applications, so I wasn't entirely surprised by that.

But then it dawned upon me ...


TRIM does not matter because pgbench does not discard any pages; it simply overwrites the pages in place, so there's nothing to TRIM.

What pgbench does is (a) write new data by adding new pages and (b) update those pages by rewriting them in place. None of that involves discarding pages - the controller knows which page of the file it's overwriting, and therefore knows which SSD pages are dirty and need to be erased. The same applies to WAL segments, which are reclaimed and overwritten in place.

Maybe the TRIM does matter for other workloads (e.g. creating a lot of temporary files on the SSD device), or in databases where tables are often dropped or truncated. But for the simple pgbench workloads, TRIM does not matter at all.

Posted by Pavel Golub in MicroOLAP on 2015-11-18 at 09:48:28

I’m using this perfect tool called Sublime Text 3 for a bunch of tasks. One of them is viewing SQL scripts from time to time. ST3 has perfect SQL highlighting, but what I miss the most is context help functionality, e.g. selecting “LEFT JOIN” and hitting the F1 hot key. But that’s not a problem, since ST3 has a lot of packages. To solve my problem I need the GotoDocumentation package and some tuning. Here are my settings for GotoDocumentation:

"docs": {
// obj containing the docs for each scope
// these are merged with the default ones
// the key value pair represent scope -> doc url
// supported placeholders:
// - %(query)s the selected text/word
// - %(scope)s the current scope
"sql": ""

"pascal": "",
// if we have no docs for the current scope
// we will try using the fallback one,
// to disable set to false
"fallback_scope": "google"

ST3 context help for pg sql

Posted by Marko Tiikkaja on 2015-11-18 at 02:30:00
Recently I used perf to look into what could be the cause of our increased CPU usage on the PostgreSQL server (encouraged by Andres' great talk at ). I was somewhat surprised to find that thirty percent of the CPU time used by postgres was spent spinning on spinlocks, i.e. doing no actual useful work.  Digging into the profile a bit more, most of these were coming from a function called

On 11th of November, Robert Haas committed patch: Generate parallel sequential scan plans in simple cases.   Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying […]
On 30th of October, Tom Lane committed patch: Implement lookbehind constraints in our regular-expression engine.   A lookbehind constraint is like a lookahead constraint in that it consumes no text; but it checks for existence (or nonexistence) of a match *ending* at the current point in the string, rather than one *starting* at the current […]
Posted by Tomas Vondra in 2ndQuadrant on 2015-11-16 at 19:00:00

As you may be aware, PostgreSQL splits data files into 8kB pages by default. It's possible to use different page sizes (up to 32kB), but that requires compiling custom packages, so almost no one does that. It's quite rare to come across an installation using a different page size at a customer, for example.

Why 8kB data pages and not a different size? Clearly, smaller or larger pages might be more efficient in some cases, and indeed some other databases use different page sizes. For example, InnoDB uses 16kB pages by default, and some databases even allow a mix of page sizes. So there seems to be a gain in using different page sizes; otherwise the engineers working on those products would not waste time implementing it.

So what factors determine the optimal page size? And how do SSD disks change the results?

One of the fundamental rules about page size, called the Five Minute Rule, was formulated by Jim Gray and G. F. Putzolu in 1985, and approaches the problem as an economic question: "Given disk and RAM prices, how long should a page of a given size be kept in memory?" The equation expressing the rule is essentially this:

break_even_interval = (pages_per_MB_RAM / IOPS_per_disk) * (price_per_disk / price_per_MB_RAM)

and for values common in 1985 (1kB pages and disk/RAM prices in 1985), the break even interval came out as roughly 5 minutes (hence the name of the rule).

However, since 1985 a lot has changed - the prices of RAM and disks have plummeted, the performance of rotational storage has slightly improved, and so on. Despite that, the 1997 revision of the paper came to about the same conclusion - the break-even interval is still about 5 minutes, however assuming an 8kB page size (not 1kB as used in the 1985 paper).
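
To make the arithmetic concrete, here is a small sketch using purely illustrative numbers of roughly late-1990s magnitude (not taken from the post or the papers): 8kB pages give 128 pages per MB of RAM, a disk doing about 64 random accesses per second costs about $2000, and RAM costs about $15 per MB:

SELECT (128.0 / 64)    -- pages_per_MB_RAM / IOPS_per_disk
     * (2000.0 / 15)   -- price_per_disk / price_per_MB_RAM
       AS break_even_seconds;  -- roughly 267 seconds, i.e. about five minutes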

This is probably also the reason why PostgreSQL uses 8kB pages - Postgres95 (as it was called at that time) was released shortly before the paper was published, but it had to follow the same reasoning.

But again, a lot has changed since 1997 - the RAM/disk prices continued to plummet and, probably more importantly, flash storage is

[continue reading]

Posted by Dimitri Fontaine in 2ndQuadrant on 2015-11-16 at 13:18:00

I had the pleasure to be invited to speak at All Your Base Conference 2015 about PostgreSQL (of course). The conference gathers together lots of user experience around data management and database products, either in the now classic meaning of the word (I mean relational database management systems here) or the newer set of trade-offs represented by the NoSQL set of tools.

The conference was very nicely organized and ran smoothly, and I got the unique chance to hear a back-from-the-trenches story series about people's tools of choice and the reasons why they do things their way. It was very enlightening!

If you ever get the chance to attend All Your Base, take my word for it and just go!

Posted by Vladimir Borodin on 2015-11-16 at 13:00:00

People with experience of commercial RDBMSs are used to having the ability to answer the question "What is a particular session doing right now?" or even "What was that session waiting on 5 minutes ago?" For a long time PostgreSQL did not have such diagnostic tools, and DBAs got by with workarounds of varying sophistication. I gave a talk (in Russian) about how we do it. The talk was a collaboration with Ildus Kurbangaliev from PostgresPro, and Ildus spoke about the tool that allows answering the questions above.

Strictly speaking, it is not the first attempt to implement what people call a wait [event] interface, but all the previous attempts were never brought to a reasonable state and died as proof-of-concept patches. pg_stat_wait, however, is currently available as a set of patches against the current stable 9.4 branch and the 9.6 branch under development (for the actual versions, look at pgsql-hackers@).

After quite a lot of testing and bug fixing, we even deployed them to production.


Until it all becomes part of core PostgreSQL, you need to recompile postgres. I think describing the rebuild as ./configure && make && sudo make install is pointless here; it is much better to look into the documentation.

After that you should add pg_stat_wait to shared_preload_libraries. Additionally, you can add the following options to postgresql.conf:

  • waits_monitoring = on - turns the functionality on,
  • pg_stat_wait.history = on - stores a history of wait events,
  • pg_stat_wait.history_size = 1000000 - the number of recent events to keep in the history,
  • pg_stat_wait.history_period = 1000 - how often wait events should be stored in the history (ms).

After that you should restart PostgreSQL and run CREATE EXTENSION pg_stat_wait. Then everything will start working.


What exactly will start to work? First, you may look at what is inside the extension:

rpopdb01g/postgres M # \dxS+ pg_stat_wait
           Objects in extension "pg_stat_wait"
                   Object Description

[continue reading]

Andreas 'ads' Scherbaum

FOSDEM PGDay is a one-day conference that will be held ahead of FOSDEM in Brussels, Belgium, on Jan 29th, 2016. This will be a one-day focused PostgreSQL event, with a single track of talks. Registration is required to attend and will open soon. Since we have a limited number of seats available for this event, we urge everybody to register as soon as possible once registration opens.

PostgreSQL Europe will also have our regular devroom at FOSDEM on Sunday the 31st, which will be held at the main FOSDEM venue at ULB. This day will, of course, continue to be free of charge and open to all FOSDEM entrants. No registration is required to attend this day.

For full details about the conference, venue and hotel, see

The call for papers is now open for both these events. We are looking for talks to fill both these days with content for both insiders and new users. Please see for details and submission information.

The deadline for submissions is December 7th, 2015, but we may as usual pre-approve some talks, so get your submissions in soon!

We will provide a special rate with the Brussels Marriott Hotel. For details, see

One small PostgreSQL 9.5 feature I worked on is the new hinting mechanism, which sometimes hints, based on a score, at what you might have meant to type after misspelling a column name in an SQL query. The score heavily weights Levenshtein distance. A HINT message is sent to the client, which psql and other client tools will display by default.

It's common to not quite recall offhand whether a column name is pluralized, or where underscores are used to break up the words that make up the name of a column. This feature is targeted at that problem, providing guidance that allows the user to quickly adjust their query without mental context switching. For example:

postgres=# select * from orders where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: select * from orders where order_id = 5;
HINT:  Perhaps you meant to reference the column "orders"."orderid".

You may also see a hint in the case of two possible matches, provided both matches have the same score, and the score crosses a certain threshold of assumed usefulness:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where order_id =...
HINT:  Perhaps you meant to reference the column "o"."orderid" or the column "ol"."orderid".

If the column reference is qualified with an alias (which this query must use anyway), the hint becomes more specific:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where o.order_id = 5;
ERROR:  42703: column o.order_id does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where o.order_id...
HINT:  Perhaps you meant to reference the column "o"."orderid".

This feature should make writing queries interactively in psql a bit more pleasant. Mental context switching to figur

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2015-11-14 at 15:20:00
Fresh from yesterday's help file:

Say you want to insert a record into your table with all the default values set. Later on you'll update some of the values. You might want to do this so you can get the row's ID field. So you try this:
insert into mytable values () returning my_id;
but you find it doesn't work. VALUES isn't allowed to have an empty list. Here's what you do:
insert into mytable values (default) returning my_id;
This will insert the default value explicitly into the first column in the table, and implicitly into all the other columns. It doesn't matter if any of the columns doesn't have an explicit default value set - in that case the default is NULL. Of course, that means this won't work if you have NOT NULL columns without a default set.
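To see the trick end to end, here is a small self-contained example (the table definition is hypothetical, not from the post):
create table mytable (
    my_id serial primary key,
    name  text default 'unnamed',
    note  text                      -- no explicit default, so it ends up NULL
);
insert into mytable values (default) returning my_id;
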
Posted by Abdul Yadi on 2015-11-14 at 05:26:49

I have gone through an interesting, small C code package for QR encoding. Its logic is contained in two small files: QR_Encode.h (3.7KB) and QR_Encode.c (61KB). Then, on a sun-shiny Saturday morning, I managed to add in-memory monochrome bitmap construction (1 bit per pixel) and wrap the whole package up as a PostgreSQL extension module. I share it as qrcode.tar.bz2.

Please modify PG_CONFIG = /opt/pgsql/9.4/bin/pg_config in Makefile as necessary. Then make and make install.

Connect to a database then invoke sql command CREATE EXTENSION qr;

SELECT qr('QR Code with PostgreSQL', 0, 0, 4);

  1. Text to be encoded.
  2. Error correction level (0 to 3).
  3. Accepted model number (0 to 2).
  4. Scale: 4 means a dot in the QR image will be 4 pixels wide and 4 pixels high.

It returns a byte array representing a monochrome bitmap. You can save it as a file or render it directly in an HTML page.
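
For example, one way to get the bitmap into an HTML page, assuming the byte array comes back as bytea, is to base64-encode it and embed it in a data URI:

SELECT encode(qr('QR Code with PostgreSQL', 0, 0, 4), 'base64');
-- prepend 'data:image/bmp;base64,' to the result to use it as an <img> src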
QRCode Demo

Posted by Bruce Momjian in EnterpriseDB on 2015-11-13 at 19:30:01

My daughter Catherine, then 12 years old, was interviewed at Postgres Open 2013 by Craig Kerstiens and Selena Deckelmann. Community members who have heard the 23-minute recording have found it humorous, and I have recently received permission to release it to a wider audience.

Posted by Shaun M. Thomas on 2015-11-13 at 17:49:43

A couple days ago, Robert Haas announced that he checked in the first iteration of parallel sequence scans in the Postgres 9.6 branch. And no, that’s not a typo. One of the great things about the Postgres devs is that they have a very regimented system of feature freezes to help ensure timely releases. Thus even though 9.5 just released its second beta, they’re already working on 9.6.

So what is a sequence scan, and why does this matter? Past articles have covered this, but Postgres uses sequence scans when it needs to read the entire contents of a table. For larger entities consisting of tens or hundreds of millions of rows, this is a time-consuming process. Even the best processor can only handle a few million rows per second, so as scale increases vertically, there’s currently no way to address these larger data volumes efficiently without significant effort.

As slow as a sequence scan is, it’s also a relatively simple process, and a great starting point for the long road to a parallel Postgres engine. Postgres knows how big the table files are and obviously knows its own storage format, so it merely needs to set scan ranges and farm them out to workers that report the results back to a coordinator process. In theory, that’s a clean transition that avoids complicated aggregation rules or clause pushdown headaches.

But how well does it work? To see this for myself, I did something I’ve actually never done before now: download the Postgres git tree. Then I set up a test case:

CREATE TABLE test_tab AS
SELECT, repeat(' ', 20) AS junk
  FROM generate_series(1, 20000000) a(id);

ANALYZE test_tab;

With this test table, there are 20-million rows of empty junk and no indexes, so we’re effectively forcing Postgres to use a sequence scan for any query. Then we have to enable the feature with the max_parallel_degree parameter. And finally we invoke a query with a naive WHERE clause applied to every row so the engine has to actually do some work.

SET max_parallel_degree TO 1;

[continue reading]

We are pleased to announce the final agenda of keynote addresses for the first annual PGConf Silicon Valley conference. Keynotes will be delivered by Jay Kreps, CEO of Confluent, Ivan Novick, Greenplum Database Product Manager at Pivotal Software, and Umur Cubukcu, CEO and Co-founder of Citus Data. The conference is next week, November 17-18, 2015, at the South San Francisco Conference Center. Organized by Citus Data in cooperation with the San Francisco PostgreSQL Users Group, PGConf Silicon Valley is a technical conference aimed at the Silicon Valley PostgreSQL community and beyond.

This year's conference theme is "High Performance & Scalability". Breakout session tracks include DevOps, New Features, PostGIS, Tales from the Trenches, Hacking Postgres, and Data at Scale.

The Keynotes

On Wednesday, November 18, we will hear three keynote addresses that will launch a full day of breakout sessions:

  • Apache Kafka: A Commit Log for the Datacenter
    Jay Kreps, CEO of Confluent, will discuss what database logs, event data, stream processing and Franz Kafka have to do with each other.
  • "PostgreSQL License, Vendor Eco-systems, Cloud Providers, and Future of RDBMS
    Ivan Novick, Greenplum Database Product Manager at Pivotal Software, will explore how the permissive PostgreSQL license has created an ecosystem of vendors providing spin-off products based on the PostgreSQL database. He will cover the evolution of PostgreSQL innovations, ranging from databases for big data analytics to OLTP, as well as running them as shrink-wrapped software or in the cloud. He will also provide a vision for how these initiatives are converging to shape the future of relational databases.
  • Scaling Out PostgreSQL in the Distributed Era
    Umur Cubukcu, CEO and Co-founder of Citus Data.

Community Happy Hour

On Tuesday, November 17, following the day’s tutorial sessions, there will be a no host community happy hour at The Brass Elephant Lounge and Sports Bar. Promoted in conjunction with the San Francisco PostgreSQL Users Group (SFPUG), the lounge

[continue reading]

A client recently came to me with an ongoing mystery: a remote Postgres replica needed to be replaced, but every attempt to run pg_basebackup failed. It would stop partway through every time, reporting something along the lines of:

pg_basebackup: could not read COPY data: SSL error: decryption failed or bad record mac

The first hunch we had was to turn off SSL renegotiation, as it isn't supported by some OpenSSL versions. By default Postgres renegotiates keys after 512MB of traffic, and setting ssl_renegotiation_limit to 0 in postgresql.conf disables it. That helped pg_basebackup get much further along, but they were still seeing the process bail out before completion.

The client's Chef has a strange habit of removing my ssh key from the database master, so while that was being fixed I connected in and took a look at the replica. Two pg_basebackup runs later, a pattern started to emerge:
$ du -s 9.2/data.test*
67097452        9.2/data.test
67097428        9.2/data.test2
While being nearly identical in size, those numbers are also suspiciously close to 64GB. I like round numbers; when a problem happens close to one, that's often a pretty good tell of some boundary or limit. On a hunch that it wasn't a coincidence, I checked around for any similar references and found a recent openssl package bug report:

RHEL 6, check. SSL connection, check. Failure at 64 GiB, check. And lastly, a connection with psql confirmed AES-GCM:
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)

Once the Postgres service could be restarted to load in the updated OpenSSL library, the base backup process completed without issue.

Remember, keep those packages updated!
Posted by Josh Berkus in pgExperts on 2015-11-13 at 04:45:00
First, in case you somehow missed it, PostgreSQL 9.5 Beta 2 is now out.  Time for another round of testing!  Fewer and fewer bugs are being found, so we're not far from a final release.  I don't know about anyone else, but we're going into production on Beta 2 in a couple of places.  Can't wait any longer for Upsert and RLS.

Second, pgConfSV has announced its keynotes, from CitusData, Pivotal, and -- as a bit of a surprise -- from streaming data thing Kafka (before you ask, I didn't pick the keynotes). I believe registration is still open, so you can still go to the conference next week if you act quickly.

Thirdly, I have a roundup of some cool blogs covering PostgreSQL which aren't on Planet and you may have missed: did a terrific two-part article on why PostgreSQL is the best open source database.  Read it here: Part I  Part II

They also covered using JSON in Postgres with Python.

In stranger news, there's an amazingly strange story about Soylent, PostgreSQL, and auto-responding to poisoning accusations.  Yes, Soylent Inc. uses PostgreSQL, why not?  Read the whole weird tale here on Zapier's blog.

That's it for my round up ... if you have some good links, post them in the comments.