PostgreSQL
The world's most advanced open source database
Top posters
Number of posts in the past month
Top teams
Number of posts in the past month
Feeds
Twitter
Planet
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
Contact
  • Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.
TOMORROW! Wednesday Feb 10th, 7PM at Sun. I'll be presenting on all the new stuff in Postgres 9.0. RSVP to austinpug@postgresql.org appreciated.
Posted by Andreas Kretschmer on 2010-02-09 at 07:29:42
Hi all,

I'm back from FOSDEM, Brussels. It was a great and impressive event and i meet there a lot of PostgreSQL-people. The talks given in our devroom were imposing, i have learned a LOT about our upcoming new release 9.0, in particular streaming replication, exclusion constraints and (with thanks to the talk given from Sergey Petrunya about MariaDB) join removal (the MariaDB calls this feature "table elimination" and after his talk i have discussed that feature with Stefan 'mastermind' Kaltenbrunner. He informed me that we have this feature in 9.0, but we called it 'join removal').


Okay, i don't want peeve you with my really poor english. Last but not least, i have made a lot of pictures. You can see almost all pictures here:

http://a-kretschmer.de/bruessel/

Not all pictures are in a reasonable quality, sorry. The pictures are reduced to 50%, if someone want individual pictures in full resolution, just ask me.


I'm looking forward to see you in Amsterdam...
Posted by Josh Berkus in pgExperts on 2010-02-09 at 00:02:21
Given that they occur on the same continent 2 months apart, many people are confused by PostgreSQL East vs. pgCon. Let me run down the comparison so that people can be clear on what each offers (and go to both, of course!):

I explored two interesting topics today while learning more about Postgres.

Partial page writes

PostgreSQL’s partial page write protection is configured by the following setting, which defaults to “on”:

full_page_writes (boolean)

When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint… Storing the full page image guarantees that the page can be correctly restored, but at a price in increasing the amount of data that must be written to WAL. (Because WAL replay always starts from a checkpoint, it is sufficient to do this during the first change of each page after a checkpoint. Therefore, one way to reduce the cost of full-page writes is to increase the checkpoint interval parameters.)

Trying to reduce the cost of full-page writes by increasing the checkpoint interval highlights a compromise. If you decrease the interval, then you’ll be writing full pages to the WAL quite often. This should in theory lead to surges in the number of bytes written to the WAL, immediately following each checkpoint. As pages are revisited over time for further changes, the number of bytes written should taper off gradually until the next checkpoint. Hopefully someone who knows more can confirm this. Does anyone graph the number of bytes written to their WAL? That would be a nice illustration to see how dramatic this surging is.

Decreasing the checkpoint interval seems a bit scary, and is bound to have its own costs, for all the usual reasons. A massive checkpoint once in a while should be really expensive, and would lead to a bad worst-case time for recovery. Does the new bgwriter implementation in 8.3 fix any of this? In theory it could, but I don’t know enough yet to say. I have heard conflicting opinions on this point. I have a lot more to read about it before I form my own opinion.

Storing full pages might not really be that expensive. It could bloat the WAL, but is the cost (in terms of time) really that high? InnoDB (in MySQ

[continue reading]

A while back I submitted a couple of talks for PG-East 2010 in Philly, and over the past couple of weeks I've been nagging the organizers semi-frequently to get some pre-info on whether I've been accepted or not, since flight prices started to climb fairly rapidly. The site clearly says information that the information will be available on Feb 15th, so I can't really complain that the answer kept being "don't know yet".

A couple of days ago, I got a note from Dave pinged me with a message asking if I was approved. Turns out this press-release had been posted (by his company, no less). Which explicitly names me as a speaker at the conference.

Took me two more days of chasing down JD, but I now have confirmation I'll be there. I don't actually know what I'll be speaking about, but it's a pretty safe bet it will be PostgreSQL related.

I call this Time management by press releases. If I could only get it to apply to all meetings, I would no longer need to keep my own calendar up to date.

So, I'll see you in Philly!

Posted by Dave Page in EnterpriseDB on 2010-02-08 at 15:31:38
Well, FOSDEM 2010 is now over and it all seemed to go pretty well. The PostgreSQL Project was represented well (as one would hope for our second biggest annual European gathering), with the majority of the Hotel Agenda Louise seemingly occupied by database geeks.

On the Friday night we had a database dinner at Les Brasseurs de la Grand Place where we were joined by Sergey Petrunya and Kristian Nielsen from MariaDB. Good conversation, good beer and good food followed later by an aborted attempt to join the FOSDEM Beer Event at Cafe Delirium (it was just too busy) and a successful landing at the Irish pub a short walk form the Hotel.


Saturday morning was the start of the conference itself. We'd hoped to scrounge an extra table (much needed, with the number of people and the amount of swag we had), but unfortunately that didn't work out. Somehow, we managed to squeeze onto one.


The talks started at 1PM in our dev room (which we only had for Saturday afternoon unfortunately). Magnus and Jean-Paul started with report on the state of PostgreSQL Europe, and then it was my turn with my talk on "Developments in PostgreSQL 9.0". Slides and the full schedule can be found on the PostgreSQL Wiki. The dev room was basically packed solid with people sitting on the floor for every talk. Please FOSDEM organisers - give us a bigger room next year!


Saturday night was a late dinner at possibly the slowest restaurant in the world, who clearly weren't expecting too many diners. Food wasn't too bad though, when it finally arrived.

Sunday was spent mostly on the booth and (in my case) in lots of ad-hoc meetings on topics such as the new PostgreSQL project infrastructure thats in development, as well as PGDay.EU 2010 which is being planned in Paris.

Most of us left at around 5PM, with Heikki, Greg, Magnus, Stefan and I heading for the airport. After pizza and a brief scare when Stefan lost his boarding pass, we left for home.
Posted by Robert Gravsjö on 2010-02-08 at 08:24:54
A little off topic but I'm very pleased to see that the timetable for Django 1.2 seems to be holding since they just entered feature freeze with beta 1. Among my favorite of features to come is, of course, support for multiple database backends. Other ...
Posted by Andreas Scherbaum on 2010-02-07 at 22:55:00
Author
Andreas 'ads' Scherbaum
The Open Source people have met again, like every year, in Brussels for the biggest open source event in europe - FOSDEM. The European PostgreSQL User Group was presenta booth and a devroom, plus a lot "community".


Continue reading "PostgreSQL @ FOSDEM 2010"
Posted by Josh Berkus in pgExperts on 2010-02-07 at 15:17:07
Per discussion on the pgsql-hackers mailing list this month, the PostgreSQL database really needs a new (or dramatically improved) Python driver. I have been only too aware of this due to the amount of Django work my team is doing lately. Let me sum up the situation so that you don't have to comb all the way through that thread. Hopefully at least one person reading this will take it as a call to action.
Posted by David Fetter in pgExperts on 2010-02-07 at 09:32:04
This week's PostgreSQL Weekly news may be rescheduled to Wednesday February 10, evening PST.

Thanks very much to whatever component failed at my home for this.

PostGIS 1.5.0 is finally out

I'm happy to report that after a long haul, we have finally released PostGIS 1.5.0. Two months late, but there it is, and its a really great release I think; Perhaps the best release ever.

Summary excerpted from Paul's slightly premature announcement

February 4, 2010

The PostGIS development team has, after a long course of reflection
and a detailed self-examination of our various personal failings,
decided to release PostGIS 1.5.0 to the public.

  http://postgis.org/download/postgis-1.5.0.tar.gz

This new version of PostGIS includes a new "geography" type for
managing geodetic (lat/lon) data, performance-enhanced distance
calculations, GML and KML format readers, an improved shape loading
GUI, and other new features as well.

Especial thanks to:

* Dave Skea for algorithms and mathematics necessary to support
spherical geometry
* Nicklas Avén for the new performance enhanced distance calculations
and other distance-related functions
* Sandro Santilli for new buffering features (end caps and style options)
* Olivier Courtin for GML/KML input functions
* Guillaume Lelarge for support for the upcoming PgSQL 9.0
* George Silva for an example implementation of history tables
* Vincent Picavet for Hausdorff distance calculations
* The maintainers of GEOS, Proj4, and LibXML, without whom we would
have less of a spatial database

Love, the PostGIS project steering committee,

Mark Cave-Ayland
Kevin Neufeld
Regina Obe
Paul Ramsey

Continue reading "PostGIS 1.5.0 out and PLR working on Windows 8.3-8.4 installs"
Posted by Dan Langille on 2010-02-06 at 18:04:02
I have about 14,000 data points collected from backups. I started in Oct 2007, but have not done anything with the data. I’d like your help. I have put the data into a PostgreSQL database and dumped it. The dump is here: http://langille.org/tmp/dlt_stats.sql.gz The table looks like: $ psql dan psql (8.4.2) Type "help" for help. dan=# \d dlt_stats [...]
Posted by Andrew Dunstan in pgExperts on 2010-02-06 at 14:33:29
The recent debate Bruce Momjian has started on the confusing mess that is the Python drivers for PostgreSQL made me feel very glad I just don't use any of them. As Just Another Perl Hacker, I am so glad that we have a good, well defined and fairly stable database API for Perl, namely DBI, and a pretty darn good implementation for PostgreSQL in DBD::Pg.

Add to these some of the higher end modules like DBIx::Class for use in frameworks and ORMs, and SQL::Translator for dealing with multiple database flavors, and Perl programmers doing database work are in pretty good shape.

Posted by Peter Eisentraut on 2010-02-05 at 23:19:22
My favorite PostgreSQL 9.0 feature does not have a two-letter acronym. It's the new bytea format, available since 8.5alpha1.

At F-Secure Labs, as you might imagine, we store information about a bunch of malware samples. Throughout the computer security industry, file samples, malware or perhaps not, are referred to by a hash value, such as MD5, SHA1, or SHA256. The typical representation of such hash values in most programming environments and also in prose and literature is the hexadecimal format, for example da39a3ee5e6b4b0d3255bfef95601890afd80709. Except when you want to write a test case against PostgreSQL or want to track down a problem, you'll be looking for \3329\243\356^kK\0152U\277\357\225`\030\220\257\330\007\011, also known as \\3329\\243\\356^kK\\0152U\\277\\357\\225`\\030\\220\\257\\330\\007\\011 in some contexts.

Well, that's over now; it will show as \xda39a3ee5e6b4b0d3255bfef95601890afd80709. You will still need to take care of the backslash, but that will surely be resolved when standard_conforming_strings is turned on in version 10.0 ;-), or we implement a new, SQL-standard conforming binary string type without legacy syntax issues.

By the way, the actual origin of this feature idea was a performance problem, reported by Bernd Helmle. The new format is quite a bit faster to encode and decode: In some internal tests, pg_dump of tables with mostly binary data was twice as fast and created a dump file that was half the size with the new format compared to the old format. So hopefully everyone wins.

Another new feature in the same area, by the way, is that PL/Python now supports bytea values sanely, contributed by Caleb Welton.

So when will PostgreSQL version 9.0 come out? I decided to "run the numbers" and take a look at how the Postgres project has done historically. Here's a quick graph showing the approximate number of days each major release since version 6.0 took:

Some interesting things can be seen here: there is a rough correlation between the complexity of a new release and the time it takes, major releases take longer, and the trend is gradually towards more days per release. Overall the project is doing great, releasing on average every 288 days since version 6. If we only look at version 7 and onwards, the releases are on average 367 days apart. If we look at *just* version 7, the average is 324 days. If we look at *just* version 8, the average is 410. Since the last major version that came out was on July 1, 2009, the numbers predict 9.0 will be released on July 3, 2010, based on the version 7 and 8 averages, and on August 15, 2010, based on just the version 8 averages. However, this upcoming version has two very major features, streaming replication (SR) and host standby (HS). How those will affect the release schedule remains to be seen, but I suspect the 9.0 to 9.1 window will be short indeed.

As a recap, the Postgres project only bumps the first part of the version number for major changes (Although many, myself included, would argue that 7.4 was such a major jump it should have been called 8.0). The second number occurs anytime a "new release" happens, and means new features and enhancements. The final number, the revision, is only incremented for security and bug fixes, and is almost always a 100% binary compatible drop in for the previous revision in the branch. (What's the average (mean) days between revisions? 84 days since version 6, and 88 days since version 7. The medians are 84 and 87 respectively)

How busy were those periods? Here's the number of commits per release period. Note that I said release period, not release, as commits are still being made to old branches, although this is a very small minority of

[continue reading]

With the talk about PostgreSQL 9.0 alpha 5, I thought it is time for me to try out another CVS head build on OpenSolaris. Of course this time this was on my home desktop which runs OpenSolaris 2009.06
I wanted to download the CVS head. The instructions are there on the wiki page. However before following it I needed to install the CVS on my OpenSolaris instance.

#pkg install SUNWcvs

The using the instructions I created a copy of the cvs repository and created my own project workspace.

I already had the Sun Compilers on my setup. (If not or have an old copy then you can always install or upgrade it as
# pkg install sunstudio
)
So I started my task of creating the new binaries on OpenSolaris. I found it to be bit bumpy.
Here is my configure options with my standard options.
$ ./configure --prefix=$HOME/project CFLAGS="-xO3 -xarch=native -xspace -W0,-Lt -W2,-Rcond_elim -Xa  -xildoff -xc99=none -xCC" --without-readline

However then I hit my first problem to do make.
Need to do using GNU make which was not installed on my desktop.
Back to pkg manager

# pkg install SUNWgmake

Continuing again with make which proceeded and then eventually stopped again due to missing bison. (I wonder why the "configure" script did  not catch that?)

# pkg install SUNWbison

Anyway I started to run make again now that I have installed bison. Strangely it failed again.
Figured it still did not find bison. I had to use ./configure statement again and tried gmake after that which allowed gmake to pick up bison.

However it failed again.

gmake[3]: Entering directory `/export/home/postgres/project/pgsql.project/src/backend/parser'
/usr/bin/bison -d  -o gram.c gram.y
gmake[3]: *** [gram.c] Broken Pipe
gmake[3]: Leaving directory `/export/home/postgres/project/pgsql.project/src/backend/parser'
gmake[2]: *** [parser/gram.h] Error 2

This one was not easy to solve. I thought that probably the bison was buggy and was about to give up. Then I thought I will give it a shot using truss to figure out what is happening

$ truss -f /usr/bin/biso

[continue reading]

Posted by David Fetter in pgExperts on 2010-02-05 at 07:10:07
The herd of blue elephants is gathering for FOSDEM, including several talks in the main track.

I'd like to thank the FOSDEM for making it possible for me to attend this year.

More on this soon...
Posted by Selena Deckelmann in EndPoint on 2010-02-04 at 17:44:29

I had the pleasure of attending and presenting at LinuxConf.AU this year in Wellington, NZ. Linux Conf.AU is an institution whose friendliness and focus on the practical business of creating and sustaining open source projects was truly inspirational.

My talk this year was "A Survey of Open Source Databases", where I actually created a survey and asked over 35 open source database projects to respond. I have received about 15 responses so far, and also did my own research on the over 50 projects I identified. I created a place-holder site for my research at: ossdbsurvey.org. I'm hoping to revise the survey (make it shorter!!) and get more projects to provide information.

Ultimately, I'd like the site to be a central location for finding information and comparing different projects. Performance of each is a huge issue, and there are a lot of individuals constructing good (and bad) systems for comparing. I don't think I want to dive into that pool, yet. But I would like to start collecting the work others have done in a central place. Right now it is really far too difficult to find all of this information.

Part of the talk was also a foray into the dangerous world of classification. I tried to put together basic categories, based on conversations with individual developers and some fine-tuning with Josh Berkus. Josh gave a short overview of database models during "Relational vs Non-relational" in the Data Storage mini-conf, and we collaborated some on category definition. I also saw Devdas Bhagat give a use case talk on using Postgres, yet again confirming how wonderful transactional DDL is for developers. I also gave a lightning talk (WITHOUT SLIDES!) on Bucardo at the tail end of the Data Storage mini-conf.

Josh Berkus, during "PostgreSQL Development Today", announced to the world that the new version of Postgres would be version 9.0! And he did a live demonstration of streaming replication and hot standby. The audience seemed pleased.

I was delighted to see representatives from the Postgres community on the m

[continue reading]

Posted by Bruce Momjian in EnterpriseDB on 2010-02-04 at 16:15:00

Many computer programmers are introverts because they have chosen the solitary job of creating programs that accomplish specific tasks. Open source developers are probably even more prone to be introverts because the (boring?) meetings, office chit-chat, and lunch outings are mostly gone, with the only communication being via email, instant messaging, twitter, and blogs. Open source social interaction is certainly more controlled than traditional communication.

Read More »

Posted by Simon Riggs in 2ndQuadrant on 2010-02-04 at 14:24:38
I recently returned from a lunch meeting of the UK ex-Teradatans to see old friends and colleagues. Some people know that I spent time with Teradata when it was in startup mode, what seems like a very long time ago now. Anyway, that's left me with good knowledge and interest in parallel database systems. And that's why I know Greenplum's Chief Architect Chuck McDevitt and hence why I've been using Greenplum on and off since 2005. Greenplum have also funded some of the developments I've done for PostgreSQL.

I'm disappointed we've not made much progress with parallel operations and partitioning in core Postgres in last few releases. Recent Greenplum results show we have much work to do in improving things. http://community.greenplum.com/showthread.php?t=113
Some people may think I should be sad at that, though the way I see it, Greenplum is very close to being PostgreSQL. It just happens to have some good performance enhancements of great use in Data Warehousing. A few other enhanced versions of Postgres exist also.

Some other recent results also show that MonetDB and Infobright don't fare any better by comparison either.
http://community.greenplum.com/showthread.php?t=111

Having seen the above results I'm thinking about projects for next release now. Anybody want to fund some additional Data Warehousing features in Postgres core? I'm determined that next release we will get Bitmap Indexes in core, at least.

There's some more to discuss on parallel query, such as "How does this all relate to Hot Standby?", so I'll follow up later with another blog.
Posted by Pavel Golub on 2010-02-04 at 07:31:09

This is a beta for the upcoming 1.2.10 release.

WineHQ compatibility improved. SQL Reverse Engineering improved. Support for 8.4.x SQL dialect added.

You’re welcome to download the latest release from our website at:
http://microolap.com/products/database/postgresql-designer/download/

Please don’t hesitate to ask any questions or report bugs with our Support Ticketing system.

Posted by Andrew Dunstan in pgExperts on 2010-02-02 at 23:17:39
Well, with some help from some comments on this blog and also from rereading the message from Aidan Van Dyk and understanding it a lot more this time, I think I have a handle on how the buildfarm should support Git repositories.

When the branch copy (which will be called [branchname]/pgsql, just as it is with CVS) exists, the buildfarm client will do an update by simply calling git pull. Where it does not exist, the client will create it by calling git clone followed by git checkout -t origin/[branchname] -b bf_[branchname].

This part is already working with my test buildfarm animal, in fact.

What I will add is support for an additional git_references config parameter, which can be used as part of the clone command. By default it will not be there, in which case the user would get a plain clone of the configured master repo. Setting up the references repo would be something documented but not done by the script.

One of the things that is mildly annoying from a developer perspective (it won't really affect buildfarm users) is that, as I understand it, when using Aidan's recommendation, getting updates into your working copy will require two commands rather than one (git fetch on your local master followed by git pull on your working copy). I guess that can be aliased, and it's something I will probably do pretty soon.
Posted by US PostgreSQL Association on 2010-02-02 at 18:38:59

JD Wrote:

Last West, a sign up sheet was sent out for people who were interested in starting a Seattle PostgreSQL group. I am please to say that this group has started. Their first meeting is February 9th, 2010. I have the honor of being their first speaker. I will be speaking on Dumb Simple PostgreSQL Performance, which is the same talk I have been giving at various user groups around Seattle (Python, Django, Perl). You can check out there website here.

I tend to write about new features in new versions of PostgreSQL, but this patch actually fixes one of the things that annoy me a lot, so here it goes: On 26th of January, Simon Riggs committed: Log Message: ----------- Fix longstanding gripe that we check for 0000000001.history at start of archive recovery, even when we know it is never [...]
Posted by Joshua Tolley in EndPoint on 2010-02-02 at 05:28:30

About a month ago, Bucardo added an interesting set of features in the form of a new script called slony_migrator.pl. In this post I'll describe slony_migrator.pl and its three major functions.

The Setup

For these examples, I'm using the pagila sample database along with a set of scripts I wrote and made available here. These scripts build two different Slony clusters. The first is a simple one, which replicates this database from a database called "pagila1" on one host to a database "pagila2" on another host. The second is more complex. Its one master node replicates the pagila database to two slave nodes, one of which replicates it again to a fourth slave using Slony's FORWARD function as described here. I implemented this setup on two FreeBSD virtual machines, known as myfreebsd and myfreebsd2. The reset-simple.sh and reset-complex.sh scripts in the script package I've linked to will build all the necessary databases from one pagila database and do all the Slony configuration.

Slony Synopsis

The slony_migrator.pl script has three possible actions, the first of which is to connect to a running Slony cluster and print a synopsis of the Slony setup it discovers. You can do this safely against a running, production Slony cluster; it gathers all its necessary information from a few simple Slony queries. Here's the synopsis the script writes for the simple configuration I described above:

josh@eddie:~/devel/bucardo/scripts$ ./slony_migrator.pl -db pagila1 -H myfreebsd
Slony version: 1.2.16
psql version: 8.3
Postgres version: 8.3.7
Slony schema: _pagila
Local node: 1
SET 1: All pagila tables
* Master node: 1  Active: Yes  PID: 3309  Comment: "Cluster node 1"
  (dbname=pagila1 host=myfreebsd user=postgres)
  ** Slave node:  2  Active: Yes  Forward: Yes  Provider:  1  Comment: "Node 2"
     (dbname=pagila2 host=myfreebsd2 user=postgres)

The script has reported the Slony, PostgreSQL, and psql versions, the Slony schema name, and shows that there's only one set, replicated from the master node to one slave node, inc

[continue reading]

On 19th of January Tom Lane committed really brilliant patch: Log Message: ----------- Add pg_table_size() and pg_indexes_size() to provide more user-friendly wrappers around the pg_relation_size() function.   Bernd Helmle, reviewed by Greg Smith Why is it brilliant? Because I’m lazy. And I think it’s a virtue, and not flaw. Let’s consider this example: create table x ( id serial primary key, [...]
Posted by Robert Gravsjö on 2010-02-01 at 16:35:33
I really don't know if this is funny or sad:
Let's say you have a table and a data set, and would like to add only those rows in your data set that aren't already in the table. There are hard ways, but here's an easy one.
Continue reading "Adding Only New Rows (INSERT IGNORE, Done Right)"
The BIG feature. The feature that made PostgreSQL leap from 8.4 to 9.0. Patch was written by Fujii Masao, and committed by Heikki Linnakangas on 15th of January 2010: Log Message: ----------- Introduce Streaming Replication.   This includes two new kinds of postmaster processes, walsenders and walreceiver. Walreceiver is responsible for connecting to the primary server and streaming WAL to disk, while [...]
Posted by Peter Eisentraut on 2010-01-31 at 21:33:13
I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting

See you there! Or maybe even there.

Wait ... I have the last slot on Saturday and the first slot on Sunday?!? Great! :^)