PostgreSQL
The world's most advanced open source database
Top posters
Number of posts in the past month
Top teams
Number of posts in the past month
Feeds
Twitter
Planet
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
Contact
  • Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.
Posted by Michael Paquier in VMware on 2013-06-18 at 05:14:23
Creating RPM packages for your own needs can be quite an adventure the first time, especially with a code base as complex as Postgres considering all the submodules, options and extensions that the core code can come up with. But thanks to all the work done by the community, the building process is quite easy. [...]
Posted by Pierre Ducroquet on 2013-06-17 at 22:26:46

After almost two years of lobbying at work to promote PostgreSQL instead of MySQL, after two years of tears against poor DB typing, lack of advanced SQL features, traps in what seemed like basic needs (select * from myview where a=5, if myview is not «simple», triggers a full scan on the view result)… we are finally starting to deploy some PostgreSQL databases.

I wrote my dear sysadmin friends documentations, argumentations. Among my «it’s simpler» arguments, I promoted application_name as a way to quickly know who is hurting that bad your SQL server, along with the so simple ps faux to know who is doing what on your server… That’s the simplest monitoring possible, but when you’re in a hurry at 3AM, it can help you… Sadly, the application_name is not displayed in the ps output. I was looking for a tiny thing to patch PostgreSQL in order to discover its code a bit, it seemed like a good opportunity.

And let’s share also how I did the patch to highlight how easy patching PostgreSQL is :) (the patch has been written together with this blog entry)

 

0) System setup

First thing, you need a minimal development environment for PostgreSQL. Your tools : your favorite compiler (GCC), make, git, $EDITOR, ack-grep and a bit of patience if your CPU is slow (postgresql compiles quite quickly).


~$ cd ~/projects
~/projects$ git clone git://git.postgresql.org/git/postgresql.git
~/projects$ cd postgresql
~/projects/postgresql$ git checkout -b application-name-in-proctitle

1) Exploring the code

We want to modify how the proctitle is constructed. Under BSD systems, it’s done through the setproctitle call, easier to find than the game with ARGV under linux. Soo… let’s search it.

–/projects/postgresql$ ack-grep --type cc setproctitle
src/backend/utils/misc/ps_status.c
42: * use the function setproctitle(const char *, ...)
265: * apparently setproctitle() already adds a `progname:' prefix to the ps
329: setproctitle("%s", ps_buffer);

src/include/pg_config.h
422:/* Define to 1 if you have the `setproctitle’ function. */

Well, only one

[continue reading]

In my last post, I said there is a pull request by Pēteris Ņikiforovs to add support for EF-6 to Npgsql. Yesterday, I merged this pull request to the master branch of Npgsql.

With this merge, Npgsql has officially initial support for EF-6!

How to compile

For now, in order to compile Npgsql to use EF-6, you have to open the solution file NpgsqlEF6.sln. Later, as suggested by Pēteris Ņikiforovs, the idea is that we create a new configuration inside main project solution instead of maintain 2 separated projects.

Another thing you will need to compile Npgsql is the latest release of EntityFramework assembly through NuGet:

PM> Install-Package EntityFramework -Pre

That's it! Now you will be able to play with Npgsql and EF-6. Check out my previous post about how to use Npgsql with EntityFramework.

I'd like to thank Pēteris Ņikiforovs for his patch. And maxbundchen for his patch about Open/Close events needed for EF-6.

Please, give it a try and let me know how it works for you.
Posted by Leo Hsu and Regina Obe on 2013-06-17 at 03:28:00

I've been having issues with PostgreSQL error handling mostly on windows 64-bit using binaries I build with mingw64 and using under VC++ built PostgreSQL. Most of which I don't understand. As many have pointed out these issues MIGHT go away or be easier to debug if PostGIS was built with same tool chain as EDB VC++ distributions. I can't argue with that especially since its never been done.

The good news is changing my build chain from mingw64 GCC 4.5.4 to mingw64 GCC 4.8.0 has eradicated all of these issues, and even ones I had with PL/V8 that happended both in a pure mingw64(w32/w64) as well as PostgreSQL VCC (32/64 bit) environment. It did introduce this minor annoying nat, presumably because my libpq.dll now has a dependency on user32.dll (which just seems wrong). The other nat is that all the binaries I built that have c++ in them (depend on libstd++) now need to be recompiled which means my prior builds of PostGIS will not be compatible with my upcoming 4.8.0 ones. A real pain since for PostGIS/pgRouting I've got like at least 15 of those dependencies and growing. So I've decided to perform this exercise just for PostgreSQL 9.3 -- a new beginning at PostGIS 2.1 and see how it goes before I bother with 9.2, 9.1. These issues really only affect PostgreSQL 64-bit PostGIS users and as a 64-bit user, you may never have even come across them.


Continue reading "CMake support for PostGIS planned"
Posted by Josh Berkus in pgExperts on 2013-06-16 at 18:46:00
The first CommitFest for version 9.4 of PostgreSQL has started, and we have an inspiring list of new features, in 98 patches, ready to be reviewed.  Which means we need you to help review them!  Among the pending patches are:
Given that that's nearly 100 patches, and the PostgreSQL project has only 20 committers, it means we need you to help us review patches.  Yes, you!  You don't have to be able to hack PostgreSQL code to review a patch; if you can apply a patch to source code and build PostgreSQL using "make", you can help.  More information here in How To Review A Patch.

If you want to help, but aren't sure what to review, please join the pgsql-rrreviewers mailing list and announce yourself there.  I will assign you a patch to review.

For those of you who are experienced reviewers or who have submitted a patch to this CommitFest, please note the following:
  1. If you submitted a patch to this CommitFest, you are expected to review at least one other person's patch.  Should you choose not to do so, the community will not be obligated to guarantee you a review of your patch.
  2. If you put your name down as a reviewer on a patch, and you don't post anything for 5 days, we will send you a reminder and take your name off the patch, to prevent you from blocking other reviewers.
  3. If your patch goes to "waiting on author" and you don't respond for 5 days, the patch will be Returned with Feedback.
Rules 2 and 3 are new per this year's developer meeting.  Thanks, and lets have a fast and productive CommitFest!







[continue reading]

Version 4.11 of the PostgreSQL Buildfarm client has been released. It can be downloaded from http://www.pgbuildfarm.org/downloads/releases/build-farm-4_11.tgz

Changes since 4.10:

  • Turn down module cleanup verbosity
  • Add check for rogue postmasters.
  • Add pseudo-branch targets HEAD_PLUS_LATEST and HEAD_PLUS_LATEST2.
  • Use Digest::SHA instead of Digest::SHA1.
  • Make directory handling more robust in git code.
  • Move web transaction into a module procedure.
  • Switch to using the porcelain format of git status.
  • Provide parameter for core file patterns.
  • Use a command file for gdb instead of the -ex option

The web transaction and Digest::SHA changes have allowed the removal of a couple of long-standing uglinesses on the system. In almost all cases, the config parameter "aux_path" and the separate run_web_transaction.pl script are now redundant (the exception is older Msys systems).


Enjoy
It is simple: format c:/ and use Linux.
I’ve been a fan of PostgreSQL since 2000 when I switched to it from MySQL. I wanted stored procedures and functions. I got that with PostgreSQL. I was used to having such features from my years working with other big databases such as DB2, Oracle, and Sybase. I’ve been moving towards using the custom format [...]
Posted by Szymon Guz in EndPoint on 2013-06-12 at 14:00:00

PostgreSQL can be installed using installers prepared for your operation system. However this way you just depend on the installation settings chosen by the packages mainainers. Installation requires root privileges, on some machines programmers are not allowed to do that. What’s more, this way you rather will not install the PostgreSQL beta version.

The only way to install Postgres without root privileges, in home directory, is to compile it from sources. That’s not very difficult.

Download Sources

First of all you need to download sources. I use Github for getting the latest sources. There is Postgres Github mirror. I clone that, but you could just download zip file.

Unpack it somewhere, and you have the Postgres sources you need.

Install Needed Software

For compiling Postgres you will need some libraries and programs. The complete list can be found in Postgres documentation.

I’m using Ubuntu, the packages I use for compiling Postgres are:

  • gcc – C compiler
  • libreadline6, libreadline6-dev – readline support
  • zlib1g, zlib1g-dev – compression library used internally by Postgres
  • libpython2.7, libpython2.7-dev – for compiling with plPython support

If you are using different system, or different system/Postgres version, then your packages/libraries can be named differently.

Configure

Now you should enter the directory where your sources are and run below command for source configuration:

./configure --prefix=$HOME/postgres/ --with-python PYTHON=/usr/bin/python2.7

The --prefix parameter shows the path where Postgres will be installed.

The --with-python parameter enables compiling with plpython support.

PYTHON parameter points to current python binary installation.

The configure command should finish without any errors. If you have any errors, most probably you don’t have some needed libraries installed.

Compile

If configure succeeded, you can compile the sources. It is simple:

make -j 4

The -j parameter allows for this maximum number of jobs at the same time.

My computer has 4 cores, I want to use all of them, th

[continue reading]

Posted by Oleg Bartunov on 2013-06-11 at 09:17:53
Our customer provides us an interesting fts puzzle - result of fts query depend on the order of arguments of @@ operator. Of course, it shouldn't be happened, that's why everebody were confused. After close look we found the problem - they used textual representation of tsvector without explicit cast to tsvector, so it was converted to tsvector again (using default_text_search_config) and it happens that tsquery contained a number 77, which was absent in the original text, but appeared in the final tsvector (there was some word at position 77). It happens only if tsquery was at right and never - when tsquery was left argument. There are three form of text search operator -
TSVECTOR @@ TSQUERY, TSQUERY @@ TSVECTOR, TEXT @@ TSQUERY, that's why postgres silently does search without complaining. Always specify explicit cast to avoid such kind of confusion.

Here are examples:

=# select 'ate:3 fat:5 mice:2 one:1 rat:6' @@ '1 & mice';
?column?
----------
t

=# select '1 & mice' @@ 'ate:3 fat:5 mice:2 one:1 rat:6';
?column?
----------
f

Explicit cast made us happy !

=# select 'ate:3 fat:5 mice:2 one:1 rat:6'::tsvector @@ '1 & mice';
?column?
----------
f
I have been wondering a while about the optimal block size in PostgreSQL. Usually the default value of 8kb has proven to be beneficial for most applications. However, after listening to some talk about MonetDB, I got interested in testing various workloads for different blocksizes – especially with blocksizes larger than PostgreSQL’s current maximum of [...]
One of the challenges of setting up a cloud hosting environment is ensuring that systems are as configured and that the desired state is one which can be rebuilt if there is a problem.  We at Efficito focus on data backups and rebuilding software images to consistent states rather than backing up full images and restoring them.  This helps ensure that in the event of a disaster we can ensure that VMs are restored to a consistent software state with data restored from backup.

Our choice on architecture was guided by the following requirements:

  1. Configuration and building of virtual machines should be subject to automation without human intervention, with full integration of payment frameworks and the like.
  2. Configuration and building of virtual machines should be such that virtual machines can be fully rebuilt in the event of disaster recovery requirements.
  3. Configuration changes should be able to be automated and retriable.
  4. This can be specific to hosted clouds for specific applications (we only host LedgerSMB as an ERP solution).
Further posts will probably cover very small pieces of our system.  The entire system cannot be published here in part because we want to preserve our trade secrets.  This post however just covers some ways we use PostgreSQL as the centerpiece of this environment.

The Basic Structure and Role of PostgreSQL


Our approach is relatively simple.  Data comes in through either an administrative or customer portal, transmitted to a limited API which then goes into our configuration database.  Information can be requests for new virtual machines, configuration changes and the like.  Additionally payment notifications can come in through these interfaces as well.

PostgreSQL is then attached to the configuration system which picks up notifications of needed configuration changes and orchestrates these on the system.  This also allows us to pull information on our service deployments into our financial system for billing purposes (we use LedgerSMB beta versions of 1.4 internally, eating our own

[continue reading]

Posted by Szymon Guz in EndPoint on 2013-06-10 at 15:17:00

PostgreSQL has got the great feature named “functional indexes”. A normal index just stores sorted values of some field. It is great for searching, as the values are already sorted.

You can create an index with a simple query like:

CREATE INDEX i_test ON test (i);

It will store all values of column i from table test. This index can be used with a query like:

SELECT * FROM test WHERE i < 100 ORDER BY i;

Functional Indexes

There is also something I like most. Index can store all values you want, they don’t need to be values from the table. You can use values calculated from the table columns. They will be sorted, so searching with those indexes will be pretty fast.

Creating such index is simple:

CREATE INDEX i_test_lower_i ON test (lower(i));

The main rule is: this index can be used if you have the same function call in your query, something like:

SELECT * FROM test WHERE lower(i) = 'aaa';

Example

Let’s check something more complicated. My test table looks like:

CREATE TABLE test(t timestamp);

I filled this table with sample data. We need some bigger number of rows:

INSERT INTO test(t) SELECT generate_series(now() - '1 year'::interval, now(), '1 minute');

This way there are 500k rows.

I need to get two row sets from database. First I will get the rows with dates from the last 10 days. Later I will get all rows with dates from current year.

The Last 10 Days

I can get the rows with dates from the last 10 days like:

postgres=# explain analyze select t from test where t::date > (now() - '10 days'::interval)::date;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..14152.02 rows=175200 width=8) (actual time=265.640..272.701 rows=13558 loops=1)
   Filter: ((t)::date > ((now() - '10 days'::interval))::date)
   Rows Removed by Filter: 512043
 Total runtime: 273.152 ms
(4 rows)

For speeding this up I will crea

[continue reading]

Posted by Dan Langille on 2013-06-10 at 08:23:18
I attended a few of the talks at PGCon 2013 last month. One talk, for which I took several notes and made a few choice tweets. The main one I’m following up on is using the -Fc option on pg_dump. It was during Magnus Haganders’ talk on PostgreSQL Backup Strategies that I posted that tweet. [...]
Posted by Guillaume LELARGE in Dalibo on 2013-06-09 at 08:42:00

Last week, one of my customers asked me during a training why there is no foreign data wrapper for sqlite (actually, you can have one with multicorn, but there's no native one). I have no idea why but I was happy to learn that no sqlite FDW already existed. I wanted to write one for quite some time now, and it appeared to be the perfect idea.

So, in the evening, I started working on one. I took as foundation the blackhole foreign data wrapper, written by Andrew Dunstan. It helped a lot to start quickly. I found a bit surprising that it didn't include #ifdef for some functions and hooks, so that it could be compatible with 9.1 and 9.2. I added them in a patch that you can find here if you need them. Otherwise, you can simply delete some parts of the blackhole code.

After that, I tried to find how to open and close a sqlite database, and how to read a table in it. They have a really nice example in their "Quick start document" and "Introduction To The SQLite C/C++ Interface".

I wanted something really simple to start with. Almost two years ago, at pgconf.eu, I went to see Dave Page's talk on FDW (PostgreSQL at the center of your dataverse). So I already knew that you mainly need three functions to read a table: BeginForeignScan, IterateForeignScan, EndForeignScan. The first one has to open the connection (if it's not opened yet). The second one will be executed as many times as there are rows to grab. On its first execution, it must launch the query and get the first row. On every other iteration, it will grab a new row. And the third function helps cleaning memory. So I started to include the sqlite tutorial code in the blackhole FDW. And it worked great. I had to write some other functions, specifically the handler and the validator, but,in an hour, I had something working. Of course, it wasn't pretty. The database filename was written in the code, with no option to change it. The query executed remotely was also written in the code, which means you couldn't change the tablename without recompiling.

So I started

[continue reading]


After reading the excellent article about entity framework on Postgresql by Brice Lambson, I decided to write this post to document my experience playing with Entity Framework 4.3.1 and Npgsql. This post will be an adaptation of the Code First To a New Database walkthrough in order to make it work with Npgsql. 

First Steps

You should follow the first 4 steps of Code First To a New Database. Go ahead, I''l wait for you.

Next steps

Here is where the adaptation of the walkthrough begins. As Brice noted in his post, Npgsql currently doesn't support database creation. ( I'm working on that and hope to get news about it soon.) So, for while, you have to create the database manually.

Those are the steps you have to do to create the database and the model:

First, run this command in the terminal to create the database (or you can use pgAdmin if you prefer a GUI):

> createdb ef_code_first_sample 

After that, you have to run the following commands inside the database you have just created (to simplify permissions, remember to run this script connected as the same user who is specified in your connection string):

create table "Blog" ("BlogId" serial, "Name" varchar(255));
create table "Post" ("PostId" serial, "Title" varchar(255), "Content" varchar(8000), "BlogId" int);

And here comes the first trick you have to use when working with EF and Npgsql: the table names as well as column names need to be double quoted

Entity Framework generates code with table and column names double quoted and, to Postgresql, using double quotes means you want to preserve the casing of the names. So you need to create the tables with the correct case or else, Postgresql will complain it can't find your tables.

With the database and tables created, let's make some more configuration before we can run the sample.

Entity Framework installation

Unfortunately Npgsql doesn't support EF 5. Currently it supports 4.3.1 and there is a pull request by Pēteris Ņikiforovs to add support for EF 6. Yeah, Npgsql will have support for latest EF v

[continue reading]

With the revelations of massive surveillance by the NSA on Americans, I have thought a bit about how to securely set up end to end cryptography in order to offer guarantees of security.  This is by no means limited to offering security relative to governments, but includes the fact that organized criminals can mount attacks similar to wiretaps using man in in the middle and other possible attack techniques.

Designing a perfectly secure system is probably not possible.  A determined enough attacker will be able to gain access to any communications given enough effort, resources, and determination.  The goals of the system I am describing however would be to maximize the resources required,  and thus force evesdroppers to focus on only the most valuable targets possible and causing as narrow compromises as possible.  Additionally metadata is to some extent impossible to protect to the same extent content is.

In general SSL is good enough for key negotiation etc. if the system can be made resistant (not perfectly secure) against man in the middle attacks and if authorities can be sufficiently trusted and validated over time.

Most key exchange approaches focus solely on minimizing risk at the moment of key exchange (i.e. reducing synchronic risk).  This approach is different in that it focuses on resistance to exposure over time (reducing diachronic risk) and seeking to provide as much notification of compromised communications as possible.

Such an approach will *not* protect people against government spying in jurisdictions where keys can be demanded via subpoena or even warrant.  However this approach seeks to force authorities to seek the keys from the people under surveillance and not rely on digital surveillance.

In general as much as I dislike SSL/TLS (as I dislike pretty much every attempt to port OSI protocols to TCP/IP) it is well developed and well understood and the security dimensions of the protocol are well documented.  Additionally unlike IPSec, it is appropriate for cases where the data may need

[continue reading]

pg_top is a monitoring tool designed for PostgreSQL in a way similar to top. When using it you can get a grab at the process activity of your server with an output similar to that: last pid: 425; load avg: 0.07, 0.03, 0.02; up 0+00:52:08 12:22:02 1 processes: 1 sleeping CPU states: 0.5% user, 0.0% [...]
Posted by Greg Smith in 2ndQuadrant on 2013-06-07 at 01:22:47

First today is a PostgreSQL community blogger note. Those of you who publish to the Planet PostgreSQL blog feed should take a look at the updated Planet PostgreSQL Policy. There’s a new clause there clarifying when it’s appropriate to mention promotions of commercial products like books. We’re trying to keep every blog post to the Planet feed focused on providing useful information, and just informing people of things like product giveaways doesn’t meet that standard. Several of these have gone by recently, but moving forward that will be considered a violation of the rules.

Speaking of what it’s safe to write about, I’ve published the next installment in my growing documentation set around disks that work well with PostgreSQL. Disk Plug Pull Testing covers how to use the diskchecker.pl program to validate your disks are working as expected for database use. I’m going to include that information in every disk review I do moving forward, even though I normally only talk about ones that pass this test. I think it’s that important to emphasize how vital this is. If someone recommends a drive and doesn’t explicitly tell you they run a plug pull test, be suspicious.

I’m also working on a series of Postgres Guides for how to setup related hardware like RAID controllers. So far there’s a long discussion of the LSI MegaRAID controllers there. I’m almost done with an Adaptec controller one too. It’s hard to figure all this out and the information needed changes regularly. I’m going to make this spot the primary place I maintain this data at, mainly so I can keep everything consistent.

The post Help with getting reliable database writes appeared first on High Performance PostgreSQL.

When: 7-9pm Thu June 20, 2013
Where: Iovation
Who: Group
What: Lightning Talks

Last month we had a surprise special guest – Emily Strickland spoke to us about the way Rentrak uses Postgres. It turned into a big group discussion, my favorite kind of meeting.

We’ll be doing Lightning Talks for our June meeting. This time we mean it :) The link to sign up is on the mailing list. See the May meeting announcement for more information.

Packt has promised us some books to give away as well: a couple copies each of PostgreSQL 9.0 High Performance and the PostgreSQL 9 Admin Cookbook, as well as some of their newer eBooks: Instant PostgreSQL Backup and Restore and Instant PostgreSQL Starter. You must be present to win, of course.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

Building security will close access to the floor at 7:30.

After-meeting beer location: Huber’s. See you there!


I don't know if it is a mainline kernel problem but I can tell you that on Ubuntu Precise, Linux kernel 3.2 is a disaster for PostgreSQL. I am not even going to go into a huge rant about it. I am just posting the numbers. See for yourself. There should be a public service announcement about it.

before upgrade to 3.9

08:35:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:45:01 AM     all     30.91      0.00      5.66     40.05      0.00     23.38
08:55:02 AM     all     29.32      0.00      5.10     39.66      0.00     25.92
09:05:02 AM     all     31.71      0.00      6.24     40.99      0.00     21.06
09:15:01 AM     all     32.45      0.00      6.59     46.74      0.00     14.21
09:25:01 AM     all     20.62      0.00      5.39     60.00      0.00     14.00
09:35:01 AM     all     31.03      0.00      3.61     33.95      0.00     31.41
09:45:01 AM     all     36.54      0.00      3.22     34.13      0.00     26.11
09:55:02 AM     all     40.17      0.00      3.66     30.98      0.00     25.19
10:05:01 AM     all     33.49      0.00      3.04     32.28      0.00     31.19
10:15:01 AM     all     48.63      0.00      2.87     25.50      0.00     23.00
10:25:01 AM     all     51.34      0.00      3.56     26.06      0.00     19.04
10:35:01 AM     all     39.41      0.00      3.44     29.86      0.00     27.29
10:45:02 AM     all     36.07      0.00      8.79     30.94      0.00     24.20
10:55:03 AM     all     38.04      0.00      7.98     32.98      0.00     21.01
11:05:11 AM     all     39.25      0.00      8.81     36.75      0.00     15.19
11:15:02 AM     all     35.19      0.00      8.76     41.98      0.00     14.07
11:25:03 AM     all     38.21      0.00      9.65     38.86      0.00     13.28
11:35:02 AM     all     42.92      0.00     11.66     34.28      0.00     11.14
11:45:02 AM     all     39.40      0.00      9.96     39.03      0.00     11.61
11:55:01 AM     all     28.72      0.00      3.27     36.32      0.00     31.69

after upgrade to 3.9

08:35:02 AM     all     40

[continue reading]

Posted by David Wheeler in pgExperts on 2013-06-06 at 19:02:00

I gave a tutorial at PGCon a couple weeks back, entitled “Agile Database Development with Git, Sqitch, and pgTAP.” It went well, I think. The Keynote document and an exported PDF have been posted on PGCon.org, and also uploaded to Speaker Deck. And embedded below, too. Want to follow along? Clone the tutorial Git repository and follow along. Here’s the teaser:

Read More »

Posted by Josh Berkus in pgExperts on 2013-06-06 at 01:42:00

Our first-ever unconference at pgCon was a smashing success this year, greatly enhancing collaboration and development for the upcoming versions of PostgreSQL.  Among other things, I think we got a specification for pluggable storage out of it, which is far beyond anything I'd expected.  Were you there?  How did you like it?  Give your feedback in the comments.


About 75 people attended, and hashed various topics out in 12 discussion sessions. These included expanding PostgreSQL testing, creating demo databases, enhancing full text search, upcoming JSON features, pg_upgrade, and of course the two-part Pluggable Storage/Foreign Data Wrapper discussion.  Anyway, it's all on the wiki if you missed it.

We're definitely doing one next year, so plan to stay an extra day at pgCon. 

I thank the following people for making the first unconference a success: Salesforce.com, our unconference sponsor;  Stacey Haysler, Nikhil Sontakke, Susanne Ebrecht, Hartmut and Ian Barwick for helping to run the event; and Dan Langille, conference organizer.  Thank you folks!

Yes, I know the Unconference was almost two weeks ago, but I lost my phone in Ottawa and this is the first I could get pictures up.
Posted by Joshua Tolley in EndPoint on 2013-06-05 at 15:48:00

Image by Wikimedia user Ardfern

Yesterday I ran on to a nice practical application of a number of slightly unusual SQL features, in particular, window functions. PostgreSQL has had window functions for quite a while now (since version 8.4, in fact, the oldest version still officially supported), but even though they're part of the SQL standard, window functions aren't necessarily a feature people use every day. As a bonus, I also threw in some common table expressions (also known as CTEs, also a SQL standard feature), to help break up what could have been a more confusing, complex query.

A client of ours noticed a problem in some new code they were working on. It was possible for users to submit duplicate orders to the system in quick succession, by double-clicking or something similar. This was fixed in the code easily enough, but we needed to clean up the duplicate orders in the database. Which meant we had to find them. We defined a group of duplicates as all orders involving the same line items, with one of a set of possible status codes, created in an interval of less than five minutes by the same user.

This discussion of the time interval between two different records should immediately signal "window functions" (or possibly a self-join, but window functions are much easier in this case). A window function takes a set of rows and lets you chop them up into subsets, processing the subsets in various ways. In this case, we want to take all the rows in the orders table with a particular status value, group them by the customer who placed the order as well as by the items in the order, and then evaluate each of those groups.

As might be expected, the items associated with an order are in a different table from the orders themselves. There are probably several different ways I could have compared the items; I chose to accumulate all the items in an order into an array, and compare the resulting arrays (I imagine this would be awfully slow if orders had many different items attached to them, but it wasn't a pr

[continue reading]

Posted by Tatsuo Ishii in SRA OSS, Inc. on 2013-06-05 at 02:30:00
While developping pgpool-II 3.3, I created a small script to make developer's life a little bit easier. However, the tool might be useful for PostgreSQL/pgpool-II beginners. So I decided to introduce it in my blog.

The script is called "pgpool_setup".  It creates followings by just one command under current directory. The directory must be empty. PostgreSQL and pgpool binaries must be installed and in your command search path beforehand.
  • PostgreSQL clusters. You can choose "streaming replication mode" or "native replication mode". The former you already know, I assume. The latter is just a set of identical PostgreSQL clusters except the port numbers are different.
  • If you choose streaming replication mode, one primary server and standby servers are ready after the command runs (even with WAL archival is enabled).
  • By default those PostgreSQL cluster port number start from 11000 and continues 11001, 11002 and so on.
  • The port numbers of pgpool and pcp follow.
  • One pgpool server, with failover/online recovery/followmaster command are all set.
  • A test database "test" is created.
  • pcp commands(which are for pgpool administrator) account and password are ready for you.
By running pgpool_setup, you are ready to start pgpool and PostgreSQL with complete settings (at least for testing pupose) on your computer. I use this tool to create multiple groups  of pgpool+PostgreSQL set on my laptop.

So here's a sample session.

$ pgpool_setup
Satrting set up in streaming replication mode
creating startall and shutdownall
creating failover script
creating database cluster /home/t-ishii/work/git.postgresql.org/tmp2/data0...done.
update postgreql.conf
creating pgpool_remote_start
creating basebackup.sh
creating recovery.conf
creating database cluster /home/t-ishii/work/git.postgresql.org/tmp2/data1...done.
update postgreql.conf
creating pgpool_remote_start
creating basebackup.sh
creating recovery.conf
temporarily start data0 cluster to create extensions
temporarily start pgpool-II to create standby nodes
waiting for failover happens...done.
node_id

[continue reading]

On my laptop I've managed to create 1,001 local chained (one-to-one) streaming-only (meaning no archive directory) asynchronous replication instances. The output of the status of the list is here: https://gist.github.com/darkixion/5694200

I also tested the promotion of the 1st standby to see if it would cope with propagation to the final 1,000th standby, and it worked flawlessly. This didn't work on my copy of Linux Mint without some adjustments to the kernel semaphores values, and it does take a while for all the standbys in the chain to reach full recovery. However, promotion propagation is very fast.

Try it for yourself (if you have enough RAM that is). You may find it quicker to use my pg_rep_test tool. Just don't do this manually... it'll take far too long.

Thanks to Heikki for putting in the changes that made this archiveless cascading replication possible. :)

Posted by Andrew Dunstan in pgExperts on 2013-06-04 at 17:13:00
Today on IRC someone who has been testing out the new JSON stuff complained about a case I hadn't catered for, namely someone representing a Unicode character outside the Basic Multilingual Plane as a surrogate pair of characters constructed with the '\u' JSON escape. This is explicitly allowed in the JSON spec, so it needs to be handled. It's going to be a bit ugly to fix this, I suspect, but I'm glad to find out about it now rather than later. Thank goodness for beta testers.
Posted by Chris Travers on 2013-06-04 at 12:20:00
I have agreed to help found a LedgerSMB hosting business called Efficito.  We have a very basic web page up (which is in the process of further development) but we intend to offer hosting specifically of LedgerSMB 1.3 and higher to customers who want it.  If there is interest, please feel free to email me.

One thing we have committed to do is to leverage PostgreSQL for the overall management of our hosting services.  We intend to use the database as a point of original entry, allowing other processes to enact changes saved in the database.  This is something very different from a standard 3-Tier architecture in that the entry of data into the database triggers real-world side-effects in a distributed environment.

Obviously I cannot disclose our full codebase to the world, but I expect to be covering a number of aspects of how we will use PostgreSQL, and include small snippets of past or current versions of the code in these posts.  The examples will be naturally minimalistic and lack the context of the full system, but they should be interesting from a question of how one works with PostgreSQL in an automated environment and a number of other topics.

In essence what started off as a blog about LedgerSMB, which has expanded to include object-relational design generally, will now be expanded yet again to include database-centric automation.

I expect to cover  a few pieces in the near future including cross-type referential integrity where inet addresses match cidr blocks, and the use of ip4r to construct exclusion constraints for network ranges in table types.  The first post is probably more generally useful because understanding how to manually do referential integrity can help solve broader problems, like referential integrity over inheritance trees.

Anyway I hope you enjoy some of the newer areas of focus as well.
Posted by Tatsuo Ishii in SRA OSS, Inc. on 2013-06-04 at 06:57:00
We are going to release new version of pgpool-II 3.3 soon. Previous version 3.2 was released in last August. Since then over 12,000 copies are downloaded. 3.2 was very successfull, since 3.1 was only downloaded only 5,000 in almost same period. So we expect 3.3 to be even more successfull than 3.2. So what's new in 3.2 anyway?

Well 3.3 focuses on maturing 3.2: especially "watchdog" part. The watchdog was new in 3.2, allows to prevent a single point of failure of pgpool itself. I know we already have pgpool-HA, but the watchdog does more. It can make more than two pgpool's working together, for example, on line recovery.

In 3.3 the watchdog is much more enhanced:
  • New life check method (“heart beat”) added
  • Redundant heart beat device support
  • Secure protocol for heart beat and other messages
  • Interlocking of fail over/fail back/follow maser script
  • Monitor and auto restart watchdog process if it goes down
So please come and try 3.3 alpha now!

Based on a 2nd version of the patch implementing MVCC catalog access, here is a second set of results after the first try done last week. The same tests as last week are done, aka the backend startup time and the CREATE/DROP of multiple objects. The formula used to calculate performance comparison is the same [...]