PostgreSQL
The world's most advanced open source database
On 20th of March, Andres Freund committed patch: Use 128-bit math to accelerate some aggregation functions.   On platforms where we support 128bit integers, use them to implement faster transition functions for sum(int8), avg(int8), var_*(int2/int4),stdev_*(int2/int4). Where not supported continue to use numeric as a transition type.   In some synthetic benchmarks this has been shown […]
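For context, here is a minimal sketch (table and values made up, not from the commit) of the aggregates that pick up the faster 128-bit transition state on supported platforms:

-- Throwaway test table; on a build with 128-bit integer support, these
-- aggregates in 9.5 use an int128 transition state instead of numeric.
CREATE TABLE ints AS
    SELECT generate_series(1, 1000000)::int8 AS v;

SELECT sum(v), avg(v),                          -- sum(int8), avg(int8)
       var_samp(v::int4), stddev_samp(v::int4)  -- var_*/stdev_*(int2/int4)
FROM ints;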

In the first part of this article we configured Vagrant to execute two Ubuntu 14.04 Trusty Tahr virtual machines, respectively called pg and backup. In this second part we will look at how to use Puppet to set up and configure a PostgreSQL server on pg and back it up via Barman from the backup box.


Puppet: configuration

After defining the machines as per the previous article, we need to specify the required Puppet modules that librarian-puppet will manage for us.

Two modules are required:

  1. puppetlabs/postgresql (https://github.com/puppetlabs/puppetlabs-postgresql/) to install PostgreSQL on the pg VM
  2. it2ndq/barman (https://github.com/2ndquadrant-it/puppet-barman) to install Barman on backup

Both modules will be installed from Puppet Forge. For the puppetlabs/postgresql module, we’ll have to use version 4.2.0 at most for the moment, as the latest version (4.3.0) breaks the postgres_password parameter we’ll be using later (see this pull request). Let’s create a file called Puppetfile in the project directory, containing:

forge "https://forgeapi.puppetlabs.com"
mod "puppetlabs/postgresql", "<4.3.0"
mod "it2ndq/barman"

We can now install the Puppet modules and their dependencies by running:

$ librarian-puppet install --verbose

Although not essential, it’s preferable to use the --verbose option every time librarian-puppet is used. Without it the command is very quiet, and it’s useful to see what it’s doing as it goes. For example, without --verbose you may only find out after many minutes, when an error finally appears, that you’ve wasted precious time waiting for a dependency conflict to be resolved.

Upon successful completion of the command, a modules directory containing the barman and postgresql modules and their dependencies (apt, concat, stdlib) will be created in our working directory. In addition, librarian-puppet will create the Puppetfile.lock file to identify dependencies and versions of the installed modules, pinning them to prevent future updates. This way, subseq

[continue reading]

Posted by Andreas Scherbaum on 2015-03-31 at 22:00:00

During Nordic PGDay 2015, I attended Heikki's talk about "pg_rewind". Someone in the audience asked if it's possible to roll a PostgreSQL database forward, ahead of the master. The answer was that it's currently not possible, but there is no technical reason why this shouldn't work.

 


Continue reading "Reversing pg_rewind into the future"
Posted by Josh Berkus in pgExperts on 2015-03-31 at 21:24:00
Primary Keyvil was one of the most popular posts on my old blog.  Since the old blog has become somewhat inaccessible, and I recently did my Keyvil lightning talk again at pgConf NYC, I thought I'd reprint it here, updated and consolidated.

Two actual conversations I had on IRC ages ago, handles changed to protect the ignorant, and edited for brevity (irc.freenode.net, channel #postgresql):


    newbie1: schema design:
      http://www.rafb.net/paste/results/Bk90sz89.html

    agliodbs: hmmm ... why do you have an ID column
        in "states"?  You're not using it.

    newbie1: because I have to.

    agliodbs: you what?

    newbie1: the ID column is required for normalization.

    agliodbs chokes




    newbie2: how do I write a query to remove the duplicate rows?

    agliodbs: please post your table definition

    newbie2: http://www.rafb.net/paste/results/Hk90fz88.html

    agliodbs: What's the key for "sessions"?

    newbie2: it has an "id" column

    agliodbs: Yes, but what's the real key? 
        Which columns determine a unique row?

    newbie2: I told you, the "id" column. 
        It's a primary key and everything.

    agliodbs: That's not going to help you 

        identify duplicate sessions. You need another
        key ... a unique constraint on real data columns, 
        not just an "id" column.

    newbie2: no I don't

    agliodbs: Good luck with your problem then.



The surrogate numeric key has been a necessary evil for as long as we've had SQL. It was set into SQL89 because the new SQL databases had to interact with older applications which expected "row numbers," and it continues because of poor vendor support for features like CASCADE.

Inevitably, practices which are "necessary evils" tend to become "pervasive evils" in the hands of the untrained and the lazy. Not realizing that ID columns are a pragmatic compromise with application performance, many frameworks and development pragmas have enshrined numeric IDs as part of the logical and theoretical model of their applicatio

[continue reading]

If you're using pgbouncer, you'll notice that you can't put a WHERE clause in, which makes tracing down some particular issue on a particular connection more painful than it needs to be.
Continue reading "Monitoring pgbouncer with pgbouncer_wrapper"
Posted by Robert Haas in EnterpriseDB on 2015-03-31 at 16:28:00
PostgreSQL has three shutdown modes: smart, fast, and immediate.  For many years, the default has been "smart", but Bruce Momjian has just committed a patch to change the default to "fast" for PostgreSQL 9.5.  In my opinion, this is a good thing; I have complained about the current default, and agreed with others complaining about it, many times, at least as far back as December of 2010.  Fortunately, we now seem to have achieved consensus on this change.

Read more »
I've blogged before about how the buildfarm client software can be useful for developers and reviewers. Yesterday was a perfect example. I was testing a set of patches for a bug fix for pg_upgrade running on Windows, and they go all the way back to the 9.0 release. The simplest way to test these was using a buildfarm animal. On jacana, I applied the relevant patch in each branch repo, and then simply did this to build and test them all:

for f in root/[RH]* ; do 
br=`basename $f`
perl ./run_build.pl --from-source=`pwd`/$f/pgsql --config=jacana.conf --verbose $br
done

After it was all done and everything worked, I cleaned up the git repositories so they were ready for more buildfarm runs:

for f in root/[RH]* ; do 
pushd $f/pgsql
git reset --hard
git clean -dfxq
popd
done

Pretty simple! The commands are shown here on multiple lines for clarity, but in fact I wrote each set on one line, so after applying the patches the whole thing took 2 lines. (Because jacana only builds back to release 9.2, I had to repeat the process on frogmouth for 9.0 and 9.1, using the same process).

Posted by Christophe Pettus in pgExperts on 2015-03-29 at 02:42:19

The slides from my talk at PGConf US 2015 are now available.

Posted by Josh Berkus in pgExperts on 2015-03-28 at 18:34:00
I have a data analytics project which produces multiple statistical metrics for a large volume of sensor data.  This includes percentiles (like median and 90%) as well as average, min and max.  Originally this worked using PL/R, which was pretty good except that some of the R modules were crashy, which was not so great for uptime.

This is why, two years ago, I ripped out all of the PL/R and replaced it with PL/Python and SciPy.  I love SciPy because it gives me everything I liked about R, without most of the things I didn't like.  But now, I've ripped out the SciPy as well.  What am I replacing it with?  Well, SQL.

In version 9.4, Andrew Gierth added support for percentiles to PostgreSQL via WITHIN GROUP aggregates. As far as I'm concerned, this is second only to JSONB in reasons to use 9.4.
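As a hedged illustration (the table and column names here are made up, not the project's actual schema), the 9.4 ordered-set aggregates look like this:

SELECT device_id,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY reading) AS median,
       percentile_cont(0.9) WITHIN GROUP (ORDER BY reading) AS pct90,
       avg(reading), min(reading), max(reading)
FROM sensor_readings
GROUP BY device_id;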

Now, one of the more complicated uses I make of aggregates is doing "circular" aggregates, that is producing percentiles for a set of circular directions in an effort to determine the most common facings for certain devices.  Here's the PL/Python function I wrote for this, which calculates circular aggregates using the "largest gap" method.  This algorithm assumes that the heading measurements are essentially unordered, so to find the endpoints of the arc we look for two measurements which are the furthest apart on the circle.  This means shifting the measurements to an imaginary coordinate system where the edge of this gap is the low measurement, calculating percentiles, and then shifting it back.  Note that this method produces garbage if the device turned around a complete circle during the aggregate period.

Now, that SciPy function was pretty good and we used it for quite a while.  But we were unhappy with two things: first, SciPy is rather painful as a dependency because the packaging for it is terrible; second, having PostgreSQL call out to SciPy for each iteration isn't all that efficient.

So, since 9.4 has percentiles now, I started writing a function based on the built-in SQL percentiles.  Initially I was thinking

[continue reading]

High availability of PostgreSQL databases is incredibly important to me. You might even say it’s a special interest of mine. It’s one reason I’m both excited and saddened by a feature introduced in 9.4. I’m excited because it’s a feature I plan to make extensive use of, and saddened because it has flown under the radar thus far. It’s not even listed in the What’s new in PostgreSQL 9.4 Wiki page. If they’ll let me, I may have to rectify that.

What is this mysterious change that has me drooling all over my keyboard? The new recovery_min_apply_delay standby server setting. In name and intent, it forces a standby server to delay application of upstream changes. The implications, however, are much, much more important.

Let me tell you a story; it’s not long, actually. A couple years ago, I had to help a client that was using a hilariously over-engineered stack to prevent data loss. I only say that because at first glance, the number of layers and duplicate servers would shock most people, and the expense would finish the job. This was one of my full recommended stacks, plus a few extra bits for the truly paranoid. DRBD-bonded servers, Pacemaker failover, off-site disaster recovery streaming clones, nightly backup, off-site backup and historical WAL storage, and long-term tape archival in a vault for up to seven years. You would need to firebomb several cities to get rid of this data.

But data permanence and availability are not synonymous. All it took was a single misbehaving CPU to take out the entire constellation of database servers, and corrupt a bunch of recent WAL files for good measure. How this is possible, and how difficult it is to avoid, is a natural extension of using live streaming replicas for availability purposes. We always need to consider one important factor: immediacy applies to everything.

Here’s what actually happened:

  1. A CPU on master-1 went bad.
  2. Data being written to database files was corrupt.
  3. DRBD copied the bad blocks, immediately corrupting master-2.
  4. Shared memory became corrupt.
  5. Streamin

[continue reading]

This week the following commit has landed in PostgreSQL code tree, introducing a new feature that will be released in 9.5:

commit: cb1ca4d800621dcae67ca6c799006de99fa4f0a5
author: Tom Lane <tgl@sss.pgh.pa.us>
date: Sun, 22 Mar 2015 13:53:11 -0400
Allow foreign tables to participate in inheritance.

Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.

[...]

Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me

As mentioned in the commit message, foreign tables can now be part of an inheritance tree, be it as a parent or as a child.

Well, seeing this commit, one word comes immediately to mind: in-core sharding. This feature opens up such possibilities, with for example a parent table locally managing a partition of foreign child tables located on a set of foreign servers.

PostgreSQL already offers a way to do partitioning by using CHECK constraints (a non-intuitive system, but there may be improvements in this area in the near future). Combined with the feature just committed, here is a small example of how to do sharding without the need of any external plugin or tool, only postgres_fdw being needed to define the foreign tables.

Now let's take the example of 3 Postgres servers, running on the same machine for simplicity, using ports 5432, 5433 and 5434. The server on 5432 will hold a parent table with two child tables, both of them foreign tables located on the servers listening on 5433 and 5434. The test case is simple: a log table partitioned by year.

First on the foreign servers, let's create the child tables. Here it is for the table on server 5433:

=# CREATE TABLE log_entry_y2014(log_time timestamp,
       entry text,
       check (date(log_time) >= '2014-01-01' AND
              date(log_time) < '2015-01-01'));
CREATE TABLE

And the second one on 5434:

=# CREATE TABLE log_entry_y2015(log_tim

[continue reading]
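The excerpt stops short of the parent side; purely as a hedged sketch of where the example is heading (the parent table name, server name, dbname and options below are assumptions, not the post's own code), the setup on 5432 could look something like this with postgres_fdw:

-- On the server listening on 5432: parent table plus a foreign child
-- per remote server.
CREATE EXTENSION postgres_fdw;

CREATE TABLE log_entries (log_time timestamp, entry text);

CREATE SERVER shard_2014 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5433', dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard_2014;

-- 9.5 lets a foreign table be an inheritance child (the committed feature).
CREATE FOREIGN TABLE log_entry_y2014 ()
    INHERITS (log_entries)
    SERVER shard_2014
    OPTIONS (table_name 'log_entry_y2014');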

Posted by Josh Berkus on 2015-03-26 at 19:51:28

On November 18th, 2015, we will have an independent, multi-track conference all about high performance PostgreSQL: pgConf SV. This conference is being organized by CitusData at the South San Francisco Convention Center. Stay tuned for the call for presentations, sponsorships, and more details soon.

Posted by David Fetter in Farmers Business Network on 2015-03-26 at 13:27:51
SQL is code.

This may seem like a simple idea, but out in the wild, you will find an awful lot of SQL programs which consist of a single line, which makes them challenging to debug.

Getting it into a format where debugging was reasonably easy used to be tedious and time-consuming, but no more!
Continue reading "Formatting!"
Posted by Peter Eisentraut on 2015-03-26 at 00:00:00

PgBouncer has a virtual database called pgbouncer. If you connect to that you can run special SQL-like commands, for example

$ psql -p 6432 pgbouncer
=# SHOW pools;
┌─[ RECORD 1 ]───────────┐
│ database   │ pgbouncer │
│ user       │ pgbouncer │
│ cl_active  │ 1         │
│ cl_waiting │ 0         │
│ sv_active  │ 0         │
│ sv_idle    │ 0         │
│ sv_used    │ 0         │
│ sv_tested  │ 0         │
│ sv_login   │ 0         │
│ maxwait    │ 0         │
└────────────┴───────────┘

This is quite nice, but unfortunately, you cannot run full SQL queries against that data. So you couldn’t do something like

SELECT * FROM pgbouncer.pools WHERE maxwait > 0;

Well, here is a way: From a regular PostgreSQL database, connect to PgBouncer using dblink. For each SHOW command provided by PgBouncer, create a view. Then that SQL query actually works.
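A minimal hedged sketch of that approach (the connection string is an assumption), using the columns from the SHOW pools output above:

CREATE EXTENSION IF NOT EXISTS dblink;
CREATE SCHEMA pgbouncer;

-- One view per SHOW command; here only SHOW pools.
CREATE VIEW pgbouncer.pools AS
    SELECT *
    FROM dblink('host=127.0.0.1 port=6432 dbname=pgbouncer user=pgbouncer',
                'SHOW pools')
         AS t(database text, "user" text,
              cl_active int, cl_waiting int,
              sv_active int, sv_idle int, sv_used int,
              sv_tested int, sv_login int, maxwait int);

-- Now the query above works:
SELECT * FROM pgbouncer.pools WHERE maxwait > 0;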

But before you start doing that, I have already done that here:

[continue reading]

Posted by Rajeev Rastogi on 2015-03-25 at 05:00:00
In PostgreSQL 9.5, we can see improved performance for index scans on ">" conditions.

In order to explain this optimization, consider the below schema:
create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as:
                select count(*) from tbl2 where id2>'a' and id3>990000;

As per the design prior to this patch, the above query used the following steps to retrieve index tuples:

  • Find the scan start position by searching the B-tree for the first position satisfying the first key condition, i.e. id2>'a'.
  • Then fetch each tuple starting from the position found in step 1.
  • For each tuple, check all scan key conditions; in our example both scan key conditions are checked.
  • If the conditions match, return the tuple, otherwise the scan stops.


The problem is that the first scan key condition was already matched when finding the scan start position (step 1), so it is obvious that every further tuple will also match the first scan key condition (as the records are sorted).

Comparing the first scan key condition again in step 3 is therefore redundant.

So we changed the B-tree scan algorithm to avoid the redundant check, i.e. to remove the first key comparison for each tuple, as it is guaranteed to always be true.
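As a rough way to see the effect (a hedged sketch; the data volume and values are arbitrary assumptions, not from the post), one could populate the schema above and run the query under EXPLAIN ANALYZE:

-- Fill tbl2 from the schema above; the plan shape stays the same,
-- only the per-tuple comparisons on the first key are saved in 9.5.
INSERT INTO tbl2
    SELECT i, 'b' || (i % 100), i
    FROM generate_series(1, 1000000) AS i;
ANALYZE tbl2;

EXPLAIN ANALYZE
SELECT count(*) FROM tbl2 WHERE id2 > 'a' AND id3 > 990000;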

Performance result summary:



I would like to thank Simon Riggs for verifying and committing this patch. Simon Riggs also confirmed an improvement of 5% with both short and long indexes on the least beneficial data type, and considered it a very positive win overall.
Posted by Josh Berkus on 2015-03-24 at 19:30:10


On March 10th, we had our third ever pgDay for SFPUG, which was a runaway success. pgDaySF 2015 was held together with FOSS4G-NA and EclipseCon; we were especially keen to join FOSS4G because of the large number of PostGIS users attending the event. In all, around 130 DBAs, developers and geo geeks joined us for pgDay SF ... so many that the conference had to reconfigure the room to add more seating!

standing room only

The day started out with Daniel Caldwell showing how to use PostGIS for offline mobile data, including a phone demo.

Daniel Caldwell setting up

Ozgun Erdogan presented pg_shard with a short demo.

Ozgun presents pg_shard with PostGIS

Gianni Ciolli flew all the way from London to talk about using Postgres' new Logical Decoding feature for database auditing.

Gianni Ciolli presenting

Peak excitement of the day was Paul Ramsey's "PostGIS Feature Frenzy" presentation.

Paul Ramsey making quotes

We also had presentations by Mark Wong and Bruce Momjian, and lightning talks by several presenters. Slides for some sessions are available on the FOSS4G web site. According to FOSS4G, videos will be available sometime soon.

Of course, we couldn't have done it without our sponsors: Google, EnterpriseDB, 2ndQuadrant, CitusDB and pgExperts. So a big thank you to our sponsors, our speakers, and the staff of FOSS4G-NA for creating a great day.

Last Thursday, I had this short and one-sided conversation with myself: “Oh, cool, Pg 9.4 is out for RDS. I’ll upgrade my new database before I have to put it into production next week, because who knows when else I’ll get a chance. Even though I can’t use pg_dumpall, this will take me what, 20 […]
Posted by Heikki Linnakangas on 2015-03-23 at 20:05:31

Before PostgreSQL got streaming replication, back in version 9.0, people kept asking when we were going to get replication. That was a common conversation-starter when standing at a conference booth. I don’t hear that anymore, but this dialogue still happens every now and then:

- I have streaming replication set up, with a master and standby. How do I perform failover?
- That’s easy, just kill the old master node, and run “pg_ctl promote” on the standby.
- Cool. And how do I fail back to the old master?
- Umm, well, you have to take a new base backup from the new master, and re-build the node from scratch..
- Huh, what?!?

pg_rewind is a better answer to that. One way to think of it is that it’s like rsync on steroids. Like rsync, it copies files that differ between the source and target. The trick is in how it determines which files have changed. Rsync compares timestamps, file sizes and checksums, but pg_rewind understands the PostgreSQL file formats, and reads the WAL to get that information instead.

I started hacking on pg_rewind about a year ago, while working for VMware. I got it working, but it was a bit of a pain to maintain. Michael Paquier helped to keep it up-to-date whenever upstream changes in PostgreSQL broke it. A big pain was that it has to scan the WAL and understand all the different WAL record types – miss even one and you might end up with a corrupt database. I made big changes to the way WAL-logging works in 9.5 to make that easier: all WAL record types now contain enough information, in a common format, to know which blocks they apply to. That slashed the amount of code required in pg_rewind, and made it a lot easier to maintain.

I have just committed pg_rewind into the PostgreSQL git repository, and it will be included in the upcoming 9.5 version. I always intended pg_rewind to be included in PostgreSQL itself; I started it as a standalone project to be able to develop it faster, outside the PostgreSQL release cycle, so I’m glad it finally made it into the main distribution now. Please give it a l

[continue reading]

Posted by Marco Slot in CitusData on 2015-03-23 at 12:53:12
Posted by Umair Shahid on 2015-03-23 at 12:36:53

 

Developers have been really excited about the addition of JSON support starting with PostgreSQL v9.2. They feel they now have the flexibility to work with a schema-less, unstructured dataset while staying within a relational DBMS. So what’s the buzz all about? Let’s explore below …

Why is NoSQL so attractive?

Rapid turnaround time … it is as simple as that. With the push to decrease time-to-market, developers are under constant pressure to turn POCs around very quickly. It is actually not just POCs; marketable products are increasingly getting the same treatment. The attitude is, “If I don’t get it out, someone else will.”

Any decent sized application will need to store data somewhere. Rather than going through the pains of designing schemas and the debates on whether to normalize or not, developers just want to get to the next step. That’s how databases like MongoDB gained such tremendous popularity. They allow for schema-less, unstructured data to be inserted in document form and the developers find it easy to convert class objects within their code into that document directly.

There is a trade-off, however. The document (and key/value store) databases are very unfriendly to relations. While retrieving data, you will have a very hard time cross referencing between different tables making analytics nearly impossible. And, nightmare of nightmares for mission critical applications, these databases are not ACID compliant.

In walks PostgreSQL with JSON and HSTORE support.

NoSQL in PostgreSQL

While the HSTORE contrib module has been providing key/value data types in standard PostgreSQL table columns since v8.2, the introduction of native JSON support in v9.2 paves the way for the true power of NoSQL within PostgreSQL.

Starting v9.3, not only do you have the ability to declare JSON data types in standard tables, you now have functions to encode data to JSON format and also to extract data elements from a JSON column. What’s more, you can also interchange data between JSON and HSTORE using simple & intuitive fun

[continue reading]
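A quick hedged illustration of the JSON features described above (the table and data are made up, not from the post):

-- Store documents in a json column (9.2+), extract fields with -> / ->>
-- (9.3), and encode a row back to JSON with row_to_json.
CREATE TABLE events (id serial PRIMARY KEY, payload json);

INSERT INTO events (payload)
VALUES ('{"user": "alice", "action": "login", "meta": {"ip": "10.0.0.1"}}');

SELECT payload ->> 'user'         AS username,
       payload -> 'meta' ->> 'ip' AS ip
FROM events;

SELECT row_to_json(e) FROM events e;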

Posted by Paul Ramsey on 2015-03-21 at 16:16:00

I did a new PostGIS talk for FOSS4G North America 2015, an exploration of some of the tidbits I've learned over the past six months about using PostgreSQL and PostGIS together to make "magic" (any sufficiently advanced technology...)

 

Posted by Paul Ramsey on 2015-03-20 at 20:07:00

Somehow I've gotten through 10 years of SQL without ever learning this construction, which I found while proof-reading a colleague's blog post and looked so unlikely that I had to test it before I believed it actually worked. Just goes to show, there's always something new to learn.

Suppose you have a GPS location table:

  • gps_id: integer
  • geom: geometry
  • gps_time: timestamp
  • gps_track_id: integer

You can get a correct set of lines from this collection of points with just this SQL:


SELECT
  gps_track_id,
  ST_MakeLine(geom ORDER BY gps_time ASC) AS geom
FROM gps_points
GROUP BY gps_track_id;

Those of you who already knew about placing ORDER BY within an aggregate function are going "duh", and the rest of you are, like me, going "whaaaaaa?"

Prior to this, I would solve this problem by ordering all the groups in a CTE or sub-query first, and only then pass them to the aggregate make-line function. This, is, so, much, nicer.

PostgreSQL has provided table partitions for a long time. In fact, one might say it has always had partitioning. The functionality and performance of table inheritance have improved over the years, and there are innumerable arguments for using it, especially for larger tables consisting of hundreds of millions of rows. So I want to discuss a quirk that often catches developers off guard. In fact, it can render partitioning almost useless or counter-productive.

PostgreSQL has a very good overview in its partitioning documentation. And the pg_partman extension at PGXN follows the standard partitioning model to automate many of the pesky tasks for maintaining several aspects of partitioning. With modules like this, there’s no need to manually manage new partitions, constraint maintenance, or even some aspects of data movement and archival.

However, existing partition sets exist, and not everyone knows about extensions like this, or have developed in-house systems instead. Here’s something I encountered recently:

CREATE TABLE sys_order
(
    order_id     SERIAL       PRIMARY KEY,
    product_id   INT          NOT NULL,
    item_count   INT          NOT NULL,
    order_dt     TIMESTAMPTZ  NOT NULL DEFAULT now()
);

CREATE TABLE sys_order_part_201502 ()
       INHERITS (sys_order);

ALTER TABLE sys_order_part_201502
  ADD CONSTRAINT chk_order_part_201502
      CHECK (order_dt >= '2015-02-01'::DATE AND
             order_dt < '2015-02-01'::DATE + INTERVAL '1 mon');

This looks innocuous enough, but PostgreSQL veterans are already shaking their heads. The documentation alludes to how this could be a problem:

Keep the partitioning constraints simple, else the planner may not be able to prove that partitions don’t need to be visited.

The issue in this case is that adding an interval of a month turns the right boundary of this range constraint into a dynamic value, and PostgreSQL will not use dynamic values when evaluating check constraints for constraint exclusion. Here’s a query plan from PostgreSQL 9.4.1, which is the most recent release as of

[continue reading]
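The excerpt ends here, but the usual fix is easy to sketch: keep both boundaries constant so the planner can prove the partition doesn't need to be visited (a hedged example, not the post's own conclusion):

-- Same partition, but with a static right boundary; constraint
-- exclusion can now exclude it from plans that don't touch February.
ALTER TABLE sys_order_part_201502
 DROP CONSTRAINT chk_order_part_201502;

ALTER TABLE sys_order_part_201502
  ADD CONSTRAINT chk_order_part_201502
      CHECK (order_dt >= '2015-02-01'::TIMESTAMPTZ AND
             order_dt <  '2015-03-01'::TIMESTAMPTZ);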

The BDR team has recently introduced support for dynamically adding new nodes to a BDR group from SQL into the current development builds. Now no configuration file changes are required to add nodes and there’s no need to restart the existing or newly joining nodes.

This change does not appear in the current 0.8.0 stable release; it’ll land in 0.9.0 when that’s released, and can be found in the bdr-plugin/next branch in the mean time.

New nodes negotiate with the existing nodes for permission to join. Soon they’ll be able to join the group without disrupting any DDL locking, global sequence voting, etc.

There’s also an easy node removal process so you don’t need to modify internal catalog tables and manually remove slots to drop a node anymore.

New node join process

With this change, the long-standing GUC-based configuration for BDR has been removed. bdr.connections no longer exists and you no longer configure connections with bdr.[conname]_dsn etc.

Instead, node addition is accomplished with the bdr.bdr_group_join(...) function. Because this is a function in the bdr extension, you must first CREATE EXTENSION bdr;. PostgreSQL doesn’t have extension dependencies and the bdr extension requires the btree_gist extension so you’ll have to CREATE EXTENSION btree_gist first.

Creating the first node

Creation of the first node must now be done explicitly using bdr.bdr_group_create. This promotes a standalone PostgreSQL database to a single-node BDR group, allowing other nodes to then be joined to it.

You must pass a node name and a valid externally-reachable connection string for the dsn parameter, e.g.:

CREATE EXTENSION btree_gist;

CREATE EXTENSION bdr;

SELECT bdr.bdr_group_create(
  local_node_name := 'node1',
  node_external_dsn := 'host=node1 dbname=mydb'
);

Note that the dsn is not used by the root node itself. It’s used by other nodes to connect to the root node, so you can’t use a dsn like host=localhost dbname=mydb if you intend to have nodes on multiple machines.

Adding other nodes

You can now join other nodes to f

[continue reading]
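The excerpt is truncated here, but as a hedged sketch (node names and dsn values are assumptions), joining a further node follows the same pattern, with a join_using_dsn pointing at an existing node:

-- On the joining node (node2), after CREATE EXTENSION btree_gist;
-- and CREATE EXTENSION bdr;
SELECT bdr.bdr_group_join(
  local_node_name := 'node2',
  node_external_dsn := 'host=node2 dbname=mydb',
  join_using_dsn := 'host=node1 dbname=mydb'
);

-- Optionally block until the join has completed:
SELECT bdr.bdr_node_join_wait_for_ready();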

Posted by Paul Ramsey on 2015-03-20 at 00:00:00

The 2.1.6 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.6 release. As befits a patch release, the focus is on bugs, breakages, and performance issues. Users with large tables of points will want to prioritize this patch, for substantial (~50%) disk space savings.

http://download.osgeo.org/postgis/source/postgis-2.1.6.tar.gz

Continue Reading by clicking title hyperlink ..
Posted by Josh Berkus in pgExperts on 2015-03-19 at 20:55:00
Since the folks at Tilt.com aren't on Planet Postgres, I thought I'd link their recent blog post on cool data migration hacks.  Tilt is a YCombinator company, and a SFPUG supporter.
Posted by Jason Petersen in CitusData on 2015-03-19 at 19:53:36

Last winter, we open-sourced pg_shard, a transparent sharding extension for PostgreSQL. It brought straightforward sharding capabilities to PostgreSQL, allowing tables and queries to be distributed across any number of servers.

Today we’re excited to announce the next release of pg_shard. The changes in this release include:

  • Improved performance — INSERT commands run up to four times faster
  • Shard repair — Easily bring inactive placements back up to speed
  • Copy script — Quickly import data from CSV and other files from the command line
  • CitusDB integration — Expose pg_shard’s metadata for CitusDB’s use
  • Resource improvements — Execute larger queries than ever before

For more information about recent changes, you can view all the issues closed during this release cycle on GitHub.

Upgrading or installing is a breeze: see pg_shard’s GitHub page for detailed instructions.

Whether you want a distributed document store alongside your normal PostgreSQL tables or need the extra computational power afforded by a sharded cluster, pg_shard can help. We continue to grow pg_shard’s capabilities and are open to feature requests.

Got questions?

If you have any questions about pg_shard, please contact us using the pg_shard-users mailing list.

If you discover an issue when using pg_shard, please submit it to our issue tracker on GitHub.

Further information is available on our website, where you are free to contact us with any general questions you may have.

A nice feature extending the usage of pgbench, the in-core Postgres tool aimed at doing benchmarks, has landed in 9.5 with this commit:

commit: 878fdcb843e087cc1cdeadc987d6ef55202ddd04
author: Robert Haas <rhaas@postgresql.org>
date: Mon, 2 Mar 2015 14:21:41 -0500
pgbench: Add a real expression syntax to \set

Previously, you could do \set variable operand1 operator operand2, but
nothing more complicated.  Now, you can \set variable expression, which
makes it much simpler to do multi-step calculations here.  This also
adds support for the modulo operator (%), with the same semantics as in
C.

Robert Haas and Fabien Coelho, reviewed by Álvaro Herrera and
Stephen Frost

pgbench has for ages had support for custom input files using -f, with custom variables that can be set with for example \set or \setrandom and then used in a custom set of SQL queries:

\set id 10 * :scale
\setrandom id2 1 :id
SELECT name, email FROM users WHERE id = :id;
SELECT capital, country FROM world_cities WHERE id = :id2;

Up to 9.4, those custom variables can only be calculated with simple rules of the type "var operator var2" (the commit message above is explicit enough), resulting in many intermediate steps and variables when doing more complicated calculations (note as well that any additional operands and variables beyond the first three are simply ignored):

\setrandom ramp 1 200
\set scale_big :scale * 10
\set min_big_scale :scale_big + :ramp
SELECT :min_big_scale;

In 9.5, such cases become much easier because pgbench has been integrated with a parser for complex expressions. The calculation written above can be done more simply, and far fancier things are possible:

\setrandom ramp 1 200
\set min_big_scale :scale * 10 + :ramp
SELECT :min_big_scale;

With pgbench run for a couple of transactions, here is what you could get:

$ pgbench -f test.sql -t 5
[...]
$ tail -n5 $PGDATA/pg_log/postgresql.log
LOG:  statement: SELECT 157;
LOG:  statement: SELECT 53;
LOG

[continue reading]

Amit Kapila and I have been working very hard to make parallel sequential scan ready to commit to PostgreSQL 9.5.  It is not all there yet, but we are making very good progress.  I'm very grateful to everyone in the PostgreSQL community who has helped us with review and testing, and I hope that more people will join the effort.  Getting a feature of this size and complexity completed is obviously a huge undertaking, and a significant amount of work remains to be done.  Not a whole lot of brand-new code remains to be written, I hope, but there are known issues with the existing patches where we need to improve the code, and I'm sure there are also bugs we haven't found yet.
Read more »
On 18th of March, Alvaro Herrera committed patch: array_offset() and array_offsets()   These functions return the offset position or positions of a value in an array.   Author: Pavel Stěhule Reviewed by: Jim Nasby It's been a while since my last “waiting for" post – mostly because while there is a lot of work happening, […]
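A quick hedged example of the two new functions as committed (illustrative values, not from the post):

SELECT array_offset(ARRAY['a','b','c','b'], 'b');    -- 2 (first match)
SELECT array_offsets(ARRAY['a','b','c','b'], 'b');   -- {2,4} (all matches)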