PostgreSQL
The world's most advanced open source database
Top posters
Number of posts in the past month
Top teams
Number of posts in the past month
Feeds
Twitter
Planet
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
Contact
  • Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.
Our GIN improvement (see Full-text search in PostgreSQL in milliseconds(PDF)) can seriously improve the performance of @> operator. Please, take a look on slide #47 from our presentation of nested hstore with array support, where we summarize performance of @> operator:

Hstore - seqscan, GiST, GIN
                              100s 400s - create index
                             64MB 815MB
                   0.98s 0.3s 0.1s
                                3x 10x

We see, that GIN index is 10x times faster than seqscan. But we can further improve GIN performance if we apply special optimization for (freq & rare) GIN index scan, which currently takes ~time(freq), but with our GIN improvement it takes ~time(rare). This improvement can greatly affect @> operator, since we index keys and values separately and @> operator can be viewed as (hstore @> key) & (hstore @> value). Usually, key is frequent and for rare value we can get serious improvement. Let's check this. I use the same dataset and the same query we used in our presentation at PGCon-2013.

=# select count(*) from hs where h::hstore @> 'tags=>{{term=>NYC}}'::hstore;
count
-------
285
(1 row)

Time: 17.372 ms

That's what we expected !

Key 'tags' is very frequent
=# select count(*) from hs where h::hstore ? 'tags';
count
---------
1138532
(1 row)

=# select count(*) from hs;
count
---------
1252973
(1 row)

Finally, the slide #47 should looks like:


Hstore - seqscan, GiST,  GIN  GIN+
                              100s 400s              - create index
                             64MB 815MB
                   0.98s 0.3s 0.1s    0.017s
                                3x 10x     57x

I think this is very good argument for considering our patch for GIN improvement, which is submitted to Commitfest (https://commitfest.postgresql.org/action/patch_view?id=1137)
Posted by Joshua Drake in CommandPrompt on 2013-06-19 at 19:51:58
We have backported the postgres_fdw to 9.2. It is read only of course as the infrastructure for writes is not in 9.2 but it is usable. Enjoy it!

A couple years ago, the Semantic Version specification was updated to require a hyphen between the “patch version” and the “prerelease version.” This despite the fact that v1.0.0 of the spec had been published on the site for a while (see all the gory issues here). Naturally, I had already implemented that format for PGXN in a Perl module and Postgres data type. Since we already had some PGXN extensions with the non-hyphenated format, and I knew supporting the hyphen would break them, I held off updating those implementations for a year or so.

I finally broke down and updated them, though, to be fully compliant with the official v1.0.0 spec, and pushed the changes to the PGXN server a few months ago. And I was right: it did break things. Links broke, downloads failed, and my mailbox filled up with error messages. I triaged the worst of the issues, but since then, accessing PGXN distributions with prerelease versions has not worked so well.

Until last night. I finally got some tuits in the last month to dig into this issue and figure out how to fix it. As a result, there are new releases of PGXN Manager, the API, and the site, as well as the META.json validator that should keep things consistent and working well going forward. Assuming there are no more backwards-incompatible changes to semantic versions, of course.

But that still didn’t solve the problem of existing distributions with the old format of semantic version. Alas, I had to download them all, modify them, and re-index them. This means that, if you had URLs to prerelease versions laying around, they won’t work anymore. Fortunately, they’re prerelease versions, so I wouldn’t expect many to have them lying around anyway.

Still in the interest of disclosure (and because their SHA1s have changed), here’s a list of the changed distributions:

[continue reading]

Having a regression test suite has been a long standing TODO item. Now that we have pgpool_setup, I started to think about making it real. Today I think I finally made it.

The idea behind it is
  • create required pgpool cluster (pgpool + PostgreSQL instances) by using pgpool_setup
  • each test case is a simple shell script
  • each script returns 0 or 1 according to the test result
Sounds too simple? Yes, I admit. However it works as expected.

$ sh regress.sh
testing 001.load_balance...ok.
testing 002.native_replication...ok.
testing 003.failover...ok.
testing 050.bug58...ok.
testing 051.bug60...ok.
testing 052.do_query...ok.
testing 053.insert_lock_hangs...ok.
out of 7 ok:7 failed:0


For example, "001.load_balance" checks whether load balancing is working against 2 node native replication or streaming replication cluster by using this simple script:

#! /bin/sh
#-------------------------------------------------------------------
# test script for load balancing.
#
source $TESTLIBS
TESTDIR=testdir

for mode in s r
do
    rm -fr $TESTDIR
    mkdir $TESTDIR
    cd $TESTDIR

# create test environment
    echo -n "creating test environment..."
    sh $PGPOOL_SETUP -m $mode -n 2 || exit 1
    echo "done."

    source ./bashrc.ports

    echo "backend_weight0 = 0" >> etc/pgpool.conf
    echo "backend_weight1 = 1" >> etc/pgpool.conf
    echo "black_function_list = 'f1'" >> etc/pgpool.conf

    ./startall

    export PGPORT=$PGPOOL_PORT

    wait_for_pgpool_startup

    psql test <<EOF
CREATE TABLE t1(i INTEGER);
CREATE FUNCTION f1(INTEGER) returns INTEGER AS 'SELECT \$1' LANGUAGE SQL;
SELECT * FROM t1;        -- this load balances
SELECT f1(1);        -- this does not load balance
EOF

# check if simle load balance worked
    fgrep "SELECT * FROM t1;" log/pgpool.log |grep "DB node id: 1">/dev/null 2>&1
    if [ $? != 0 ];then
    # expected result not found
        ./shutdownall
        exit 1
    fi

# check if black function list worked
    fgrep "SELECT f1(1);" log/pgpool.log |grep "DB node id: 0">/dev/null 2>&1
    if [

[continue reading]

Autopex is the brainchild of a long night at the Royal Oak. It ties together Pex and event triggers to magically download and build any extension that you install. So after you have set everything up you can do, say, CREATE EXTENSION plproxy, and it will transparently download and build plproxy for you. (Actually, this only works if the extension name is the same as the package name. I'm planning to fix that.)

Note 1: You can't install Autopex via Pex, yet.

Note 2: I guess the next logical step would be Autoautopex, which installs Autopex and Pex automatically somehow. Patches welcome.

I suppose with logical replication, this might actually end up installing the extension code on the replication slaves as well. That would be pretty neat.

Posted by Josh Berkus in pgExperts on 2013-06-19 at 00:41:00
You already know that I'm a big fan of custom aggregates.  What's even more fun is when you combine them with the ordered aggregate feature from PostgreSQL 9.0.  For example, let's take a fairly common task: selecting the item with the maximum importance.

This is a challenge for standard aggregation because you want to display a value which relates to an importance score contained in another table.  For example, the table could be a lookup list of error types, and you want to display the most severe error.  This is simple to do if you only want one row, but what if you want to group by other columns?  Well, here's a relatively neat way using ordered aggregates and custom aggregates.

First, we want to create a generic aggregate called "first", which simply returns the first non-null value in the column:

    CREATE FUNCTION first_state (
        anyelement, anyelement )
    RETURNS anyelement
    LANGUAGE SQL
    IMMUTABLE STRICT
    AS $f$
        SELECT $1;
    $f$;


    CREATE AGGREGATE first ( anyelement ) (
            SFUNC = first_state,
            STYPE = anyelement
    );


This is a generally useful aggregate to have around, btw, so I suggest adding it to your general library.

Now, let's look at our tables:

    table errors (
        error_code text primary key,
        severity int not null default 0,
        description text
    )

    table error_log (
        lo_time timestamptz,
        error_code text    

    )

So, say I wanted to see the count of errors for each hour in the log and the most severe error during that hour?  Well, what I need to do is to select the first error code which shows up when I order by severity.  For good measure, I order by error_code as well, in case there's two codes with the same severity:

    SELECT date_trunc('hour',log_time) as log_hour,
        count(*) as error_count,
        first(error_code order by severity desc, error_code) 

            as most_severe
    FROM error_log
        JOIN errors USING (error_code)
    GROUP BY date_trunc('hour',log_time)
    ORD

[continue reading]

Posted by Michael Paquier in VMware on 2013-06-18 at 05:14:23
Creating RPM packages for your own needs can be quite an adventure the first time, especially with a code base as complex as Postgres considering all the submodules, options and extensions that the core code can come up with. But thanks to all the work done by the community, the building process is quite easy. [...]
Posted by Pierre Ducroquet on 2013-06-17 at 22:26:46

After almost two years of lobbying at work to promote PostgreSQL instead of MySQL, after two years of tears against poor DB typing, lack of advanced SQL features, traps in what seemed like basic needs (select * from myview where a=5, if myview is not «simple», triggers a full scan on the view result)… we are finally starting to deploy some PostgreSQL databases.

I wrote my dear sysadmin friends documentations, argumentations. Among my «it’s simpler» arguments, I promoted application_name as a way to quickly know who is hurting that bad your SQL server, along with the so simple ps faux to know who is doing what on your server… That’s the simplest monitoring possible, but when you’re in a hurry at 3AM, it can help you… Sadly, the application_name is not displayed in the ps output. I was looking for a tiny thing to patch PostgreSQL in order to discover its code a bit, it seemed like a good opportunity.

And let’s share also how I did the patch to highlight how easy patching PostgreSQL is :) (the patch has been written together with this blog entry)

 

0) System setup

First thing, you need a minimal development environment for PostgreSQL. Your tools : your favorite compiler (GCC), make, git, $EDITOR, ack-grep and a bit of patience if your CPU is slow (postgresql compiles quite quickly).


~$ cd ~/projects
~/projects$ git clone git://git.postgresql.org/git/postgresql.git
~/projects$ cd postgresql
~/projects/postgresql$ git checkout -b application-name-in-proctitle

1) Exploring the code

We want to modify how the proctitle is constructed. Under BSD systems, it’s done through the setproctitle call, easier to find than the game with ARGV under linux. Soo… let’s search it.

–/projects/postgresql$ ack-grep --type cc setproctitle
src/backend/utils/misc/ps_status.c
42: * use the function setproctitle(const char *, ...)
265: * apparently setproctitle() already adds a `progname:' prefix to the ps
329: setproctitle("%s", ps_buffer);

src/include/pg_config.h
422:/* Define to 1 if you have the `setproctitle’ function. */

Well, only one

[continue reading]

In my last post, I said there is a pull request by Pēteris Ņikiforovs to add support for EF-6 to Npgsql. Yesterday, I merged this pull request to the master branch of Npgsql.

With this merge, Npgsql has officially initial support for EF-6!

How to compile

For now, in order to compile Npgsql to use EF-6, you have to open the solution file NpgsqlEF6.sln. Later, as suggested by Pēteris Ņikiforovs, the idea is that we create a new configuration inside main project solution instead of maintain 2 separated projects.

Another thing you will need to compile Npgsql is the latest release of EntityFramework assembly through NuGet:

PM> Install-Package EntityFramework -Pre

That's it! Now you will be able to play with Npgsql and EF-6. Check out my previous post about how to use Npgsql with EntityFramework.

I'd like to thank Pēteris Ņikiforovs for his patch. And maxbundchen for his patch about Open/Close events needed for EF-6.

Please, give it a try and let me know how it works for you.
Posted by Leo Hsu and Regina Obe on 2013-06-17 at 03:28:00

I've been having issues with PostgreSQL error handling mostly on windows 64-bit using binaries I build with mingw64 and using under VC++ built PostgreSQL. Most of which I don't understand. As many have pointed out these issues MIGHT go away or be easier to debug if PostGIS was built with same tool chain as EDB VC++ distributions. I can't argue with that especially since its never been done.

The good news is changing my build chain from mingw64 GCC 4.5.4 to mingw64 GCC 4.8.0 has eradicated all of these issues, and even ones I had with PL/V8 that happended both in a pure mingw64(w32/w64) as well as PostgreSQL VCC (32/64 bit) environment. It did introduce this minor annoying nat, presumably because my libpq.dll now has a dependency on user32.dll (which just seems wrong). The other nat is that all the binaries I built that have c++ in them (depend on libstd++) now need to be recompiled which means my prior builds of PostGIS will not be compatible with my upcoming 4.8.0 ones. A real pain since for PostGIS/pgRouting I've got like at least 15 of those dependencies and growing. So I've decided to perform this exercise just for PostgreSQL 9.3 -- a new beginning at PostGIS 2.1 and see how it goes before I bother with 9.2, 9.1. These issues really only affect PostgreSQL 64-bit PostGIS users and as a 64-bit user, you may never have even come across them.


Continue reading "CMake support for PostGIS planned"
Posted by Josh Berkus in pgExperts on 2013-06-16 at 18:46:00
The first CommitFest for version 9.4 of PostgreSQL has started, and we have an inspiring list of new features, in 98 patches, ready to be reviewed.  Which means we need you to help review them!  Among the pending patches are:
Given that that's nearly 100 patches, and the PostgreSQL project has only 20 committers, it means we need you to help us review patches.  Yes, you!  You don't have to be able to hack PostgreSQL code to review a patch; if you can apply a patch to source code and build PostgreSQL using "make", you can help.  More information here in How To Review A Patch.

If you want to help, but aren't sure what to review, please join the pgsql-rrreviewers mailing list and announce yourself there.  I will assign you a patch to review.

For those of you who are experienced reviewers or who have submitted a patch to this CommitFest, please note the following:
  1. If you submitted a patch to this CommitFest, you are expected to review at least one other person's patch.  Should you choose not to do so, the community will not be obligated to guarantee you a review of your patch.
  2. If you put your name down as a reviewer on a patch, and you don't post anything for 5 days, we will send you a reminder and take your name off the patch, to prevent you from blocking other reviewers.
  3. If your patch goes to "waiting on author" and you don't respond for 5 days, the patch will be Returned with Feedback.
Rules 2 and 3 are new per this year's developer meeting.  Thanks, and lets have a fast and productive CommitFest!







[continue reading]

Version 4.11 of the PostgreSQL Buildfarm client has been released. It can be downloaded from http://www.pgbuildfarm.org/downloads/releases/build-farm-4_11.tgz

Changes since 4.10:

  • Turn down module cleanup verbosity
  • Add check for rogue postmasters.
  • Add pseudo-branch targets HEAD_PLUS_LATEST and HEAD_PLUS_LATEST2.
  • Use Digest::SHA instead of Digest::SHA1.
  • Make directory handling more robust in git code.
  • Move web transaction into a module procedure.
  • Switch to using the porcelain format of git status.
  • Provide parameter for core file patterns.
  • Use a command file for gdb instead of the -ex option

The web transaction and Digest::SHA changes have allowed the removal of a couple of long-standing uglinesses on the system. In almost all cases, the config parameter "aux_path" and the separate run_web_transaction.pl script are now redundant (the exception is older Msys systems).


Enjoy
It is simple: format c:/ and use Linux.
I’ve been a fan of PostgreSQL since 2000 when I switched to it from MySQL. I wanted stored procedures and functions. I got that with PostgreSQL. I was used to having such features from my years working with other big databases such as DB2, Oracle, and Sybase. I’ve been moving towards using the custom format [...]
Posted by Szymon Guz in EndPoint on 2013-06-12 at 14:00:00

PostgreSQL can be installed using installers prepared for your operation system. However this way you just depend on the installation settings chosen by the packages mainainers. Installation requires root privileges, on some machines programmers are not allowed to do that. What’s more, this way you rather will not install the PostgreSQL beta version.

The only way to install Postgres without root privileges, in home directory, is to compile it from sources. That’s not very difficult.

Download Sources

First of all you need to download sources. I use Github for getting the latest sources. There is Postgres Github mirror. I clone that, but you could just download zip file.

Unpack it somewhere, and you have the Postgres sources you need.

Install Needed Software

For compiling Postgres you will need some libraries and programs. The complete list can be found in Postgres documentation.

I’m using Ubuntu, the packages I use for compiling Postgres are:

  • gcc – C compiler
  • libreadline6, libreadline6-dev – readline support
  • zlib1g, zlib1g-dev – compression library used internally by Postgres
  • libpython2.7, libpython2.7-dev – for compiling with plPython support

If you are using different system, or different system/Postgres version, then your packages/libraries can be named differently.

Configure

Now you should enter the directory where your sources are and run below command for source configuration:

./configure --prefix=$HOME/postgres/ --with-python PYTHON=/usr/bin/python2.7

The --prefix parameter shows the path where Postgres will be installed.

The --with-python parameter enables compiling with plpython support.

PYTHON parameter points to current python binary installation.

The configure command should finish without any errors. If you have any errors, most probably you don’t have some needed libraries installed.

Compile

If configure succeeded, you can compile the sources. It is simple:

make -j 4

The -j parameter allows for this maximum number of jobs at the same time.

My computer has 4 cores, I want to use all of them, th

[continue reading]

Posted by Oleg Bartunov on 2013-06-11 at 09:17:53
Our customer provides us an interesting fts puzzle - result of fts query depend on the order of arguments of @@ operator. Of course, it shouldn't be happened, that's why everebody were confused. After close look we found the problem - they used textual representation of tsvector without explicit cast to tsvector, so it was converted to tsvector again (using default_text_search_config) and it happens that tsquery contained a number 77, which was absent in the original text, but appeared in the final tsvector (there was some word at position 77). It happens only if tsquery was at right and never - when tsquery was left argument. There are three form of text search operator -
TSVECTOR @@ TSQUERY, TSQUERY @@ TSVECTOR, TEXT @@ TSQUERY, that's why postgres silently does search without complaining. Always specify explicit cast to avoid such kind of confusion.

Here are examples:

=# select 'ate:3 fat:5 mice:2 one:1 rat:6' @@ '1 & mice';
?column?
----------
t

=# select '1 & mice' @@ 'ate:3 fat:5 mice:2 one:1 rat:6';
?column?
----------
f

Explicit cast made us happy !

=# select 'ate:3 fat:5 mice:2 one:1 rat:6'::tsvector @@ '1 & mice';
?column?
----------
f
I have been wondering a while about the optimal block size in PostgreSQL. Usually the default value of 8kb has proven to be beneficial for most applications. However, after listening to some talk about MonetDB, I got interested in testing various workloads for different blocksizes – especially with blocksizes larger than PostgreSQL’s current maximum of [...]
One of the challenges of setting up a cloud hosting environment is ensuring that systems are as configured and that the desired state is one which can be rebuilt if there is a problem.  We at Efficito focus on data backups and rebuilding software images to consistent states rather than backing up full images and restoring them.  This helps ensure that in the event of a disaster we can ensure that VMs are restored to a consistent software state with data restored from backup.

Our choice on architecture was guided by the following requirements:

  1. Configuration and building of virtual machines should be subject to automation without human intervention, with full integration of payment frameworks and the like.
  2. Configuration and building of virtual machines should be such that virtual machines can be fully rebuilt in the event of disaster recovery requirements.
  3. Configuration changes should be able to be automated and retriable.
  4. This can be specific to hosted clouds for specific applications (we only host LedgerSMB as an ERP solution).
Further posts will probably cover very small pieces of our system.  The entire system cannot be published here in part because we want to preserve our trade secrets.  This post however just covers some ways we use PostgreSQL as the centerpiece of this environment.

The Basic Structure and Role of PostgreSQL


Our approach is relatively simple.  Data comes in through either an administrative or customer portal, transmitted to a limited API which then goes into our configuration database.  Information can be requests for new virtual machines, configuration changes and the like.  Additionally payment notifications can come in through these interfaces as well.

PostgreSQL is then attached to the configuration system which picks up notifications of needed configuration changes and orchestrates these on the system.  This also allows us to pull information on our service deployments into our financial system for billing purposes (we use LedgerSMB beta versions of 1.4 internally, eating our own

[continue reading]

Posted by Szymon Guz in EndPoint on 2013-06-10 at 15:17:00

PostgreSQL has got the great feature named “functional indexes”. A normal index just stores sorted values of some field. It is great for searching, as the values are already sorted.

You can create an index with a simple query like:

CREATE INDEX i_test ON test (i);

It will store all values of column i from table test. This index can be used with a query like:

SELECT * FROM test WHERE i < 100 ORDER BY i;

Functional Indexes

There is also something I like most. Index can store all values you want, they don’t need to be values from the table. You can use values calculated from the table columns. They will be sorted, so searching with those indexes will be pretty fast.

Creating such index is simple:

CREATE INDEX i_test_lower_i ON test (lower(i));

The main rule is: this index can be used if you have the same function call in your query, something like:

SELECT * FROM test WHERE lower(i) = 'aaa';

Example

Let’s check something more complicated. My test table looks like:

CREATE TABLE test(t timestamp);

I filled this table with sample data. We need some bigger number of rows:

INSERT INTO test(t) SELECT generate_series(now() - '1 year'::interval, now(), '1 minute');

This way there are 500k rows.

I need to get two row sets from database. First I will get the rows with dates from the last 10 days. Later I will get all rows with dates from current year.

The Last 10 Days

I can get the rows with dates from the last 10 days like:

postgres=# explain analyze select t from test where t::date > (now() - '10 days'::interval)::date;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..14152.02 rows=175200 width=8) (actual time=265.640..272.701 rows=13558 loops=1)
   Filter: ((t)::date > ((now() - '10 days'::interval))::date)
   Rows Removed by Filter: 512043
 Total runtime: 273.152 ms
(4 rows)

For speeding this up I will crea

[continue reading]

Posted by Dan Langille on 2013-06-10 at 08:23:18
I attended a few of the talks at PGCon 2013 last month. One talk, for which I took several notes and made a few choice tweets. The main one I’m following up on is using the -Fc option on pg_dump. It was during Magnus Haganders’ talk on PostgreSQL Backup Strategies that I posted that tweet. [...]
Posted by Guillaume LELARGE in Dalibo on 2013-06-09 at 08:42:00

Last week, one of my customers asked me during a training why there is no foreign data wrapper for sqlite (actually, you can have one with multicorn, but there's no native one). I have no idea why but I was happy to learn that no sqlite FDW already existed. I wanted to write one for quite some time now, and it appeared to be the perfect idea.

So, in the evening, I started working on one. I took as foundation the blackhole foreign data wrapper, written by Andrew Dunstan. It helped a lot to start quickly. I found a bit surprising that it didn't include #ifdef for some functions and hooks, so that it could be compatible with 9.1 and 9.2. I added them in a patch that you can find here if you need them. Otherwise, you can simply delete some parts of the blackhole code.

After that, I tried to find how to open and close a sqlite database, and how to read a table in it. They have a really nice example in their "Quick start document" and "Introduction To The SQLite C/C++ Interface".

I wanted something really simple to start with. Almost two years ago, at pgconf.eu, I went to see Dave Page's talk on FDW (PostgreSQL at the center of your dataverse). So I already knew that you mainly need three functions to read a table: BeginForeignScan, IterateForeignScan, EndForeignScan. The first one has to open the connection (if it's not opened yet). The second one will be executed as many times as there are rows to grab. On its first execution, it must launch the query and get the first row. On every other iteration, it will grab a new row. And the third function helps cleaning memory. So I started to include the sqlite tutorial code in the blackhole FDW. And it worked great. I had to write some other functions, specifically the handler and the validator, but,in an hour, I had something working. Of course, it wasn't pretty. The database filename was written in the code, with no option to change it. The query executed remotely was also written in the code, which means you couldn't change the tablename without recompiling.

So I started

[continue reading]


After reading the excellent article about entity framework on Postgresql by Brice Lambson, I decided to write this post to document my experience playing with Entity Framework 4.3.1 and Npgsql. This post will be an adaptation of the Code First To a New Database walkthrough in order to make it work with Npgsql. 

First Steps

You should follow the first 4 steps of Code First To a New Database. Go ahead, I''l wait for you.

Next steps

Here is where the adaptation of the walkthrough begins. As Brice noted in his post, Npgsql currently doesn't support database creation. ( I'm working on that and hope to get news about it soon.) So, for while, you have to create the database manually.

Those are the steps you have to do to create the database and the model:

First, run this command in the terminal to create the database (or you can use pgAdmin if you prefer a GUI):

> createdb ef_code_first_sample 

After that, you have to run the following commands inside the database you have just created (to simplify permissions, remember to run this script connected as the same user who is specified in your connection string):

create table "Blog" ("BlogId" serial, "Name" varchar(255));
create table "Post" ("PostId" serial, "Title" varchar(255), "Content" varchar(8000), "BlogId" int);

And here comes the first trick you have to use when working with EF and Npgsql: the table names as well as column names need to be double quoted

Entity Framework generates code with table and column names double quoted and, to Postgresql, using double quotes means you want to preserve the casing of the names. So you need to create the tables with the correct case or else, Postgresql will complain it can't find your tables.

With the database and tables created, let's make some more configuration before we can run the sample.

Entity Framework installation

Unfortunately Npgsql doesn't support EF 5. Currently it supports 4.3.1 and there is a pull request by Pēteris Ņikiforovs to add support for EF 6. Yeah, Npgsql will have support for latest EF v

[continue reading]

With the revelations of massive surveillance by the NSA on Americans, I have thought a bit about how to securely set up end to end cryptography in order to offer guarantees of security.  This is by no means limited to offering security relative to governments, but includes the fact that organized criminals can mount attacks similar to wiretaps using man in in the middle and other possible attack techniques.

Designing a perfectly secure system is probably not possible.  A determined enough attacker will be able to gain access to any communications given enough effort, resources, and determination.  The goals of the system I am describing however would be to maximize the resources required,  and thus force evesdroppers to focus on only the most valuable targets possible and causing as narrow compromises as possible.  Additionally metadata is to some extent impossible to protect to the same extent content is.

In general SSL is good enough for key negotiation etc. if the system can be made resistant (not perfectly secure) against man in the middle attacks and if authorities can be sufficiently trusted and validated over time.

Most key exchange approaches focus solely on minimizing risk at the moment of key exchange (i.e. reducing synchronic risk).  This approach is different in that it focuses on resistance to exposure over time (reducing diachronic risk) and seeking to provide as much notification of compromised communications as possible.

Such an approach will *not* protect people against government spying in jurisdictions where keys can be demanded via subpoena or even warrant.  However this approach seeks to force authorities to seek the keys from the people under surveillance and not rely on digital surveillance.

In general as much as I dislike SSL/TLS (as I dislike pretty much every attempt to port OSI protocols to TCP/IP) it is well developed and well understood and the security dimensions of the protocol are well documented.  Additionally unlike IPSec, it is appropriate for cases where the data may need

[continue reading]

pg_top is a monitoring tool designed for PostgreSQL in a way similar to top. When using it you can get a grab at the process activity of your server with an output similar to that: last pid: 425; load avg: 0.07, 0.03, 0.02; up 0+00:52:08 12:22:02 1 processes: 1 sleeping CPU states: 0.5% user, 0.0% [...]
Posted by Greg Smith in 2ndQuadrant on 2013-06-07 at 01:22:47

First today is a PostgreSQL community blogger note. Those of you who publish to the Planet PostgreSQL blog feed should take a look at the updated Planet PostgreSQL Policy. There’s a new clause there clarifying when it’s appropriate to mention promotions of commercial products like books. We’re trying to keep every blog post to the Planet feed focused on providing useful information, and just informing people of things like product giveaways doesn’t meet that standard. Several of these have gone by recently, but moving forward that will be considered a violation of the rules.

Speaking of what it’s safe to write about, I’ve published the next installment in my growing documentation set around disks that work well with PostgreSQL. Disk Plug Pull Testing covers how to use the diskchecker.pl program to validate your disks are working as expected for database use. I’m going to include that information in every disk review I do moving forward, even though I normally only talk about ones that pass this test. I think it’s that important to emphasize how vital this is. If someone recommends a drive and doesn’t explicitly tell you they run a plug pull test, be suspicious.

I’m also working on a series of Postgres Guides for how to setup related hardware like RAID controllers. So far there’s a long discussion of the LSI MegaRAID controllers there. I’m almost done with an Adaptec controller one too. It’s hard to figure all this out and the information needed changes regularly. I’m going to make this spot the primary place I maintain this data at, mainly so I can keep everything consistent.

The post Help with getting reliable database writes appeared first on High Performance PostgreSQL.

When: 7-9pm Thu June 20, 2013
Where: Iovation
Who: Group
What: Lightning Talks

Last month we had a surprise special guest – Emily Strickland spoke to us about the way Rentrak uses Postgres. It turned into a big group discussion, my favorite kind of meeting.

We’ll be doing Lightning Talks for our June meeting. This time we mean it :) The link to sign up is on the mailing list. See the May meeting announcement for more information.

Packt has promised us some books to give away as well: a couple copies each of PostgreSQL 9.0 High Performance and the PostgreSQL 9 Admin Cookbook, as well as some of their newer eBooks: Instant PostgreSQL Backup and Restore and Instant PostgreSQL Starter. You must be present to win, of course.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

Building security will close access to the floor at 7:30.

After-meeting beer location: Huber’s. See you there!


I don't know if it is a mainline kernel problem but I can tell you that on Ubuntu Precise, Linux kernel 3.2 is a disaster for PostgreSQL. I am not even going to go into a huge rant about it. I am just posting the numbers. See for yourself. There should be a public service announcement about it.

before upgrade to 3.9

08:35:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
08:45:01 AM     all     30.91      0.00      5.66     40.05      0.00     23.38
08:55:02 AM     all     29.32      0.00      5.10     39.66      0.00     25.92
09:05:02 AM     all     31.71      0.00      6.24     40.99      0.00     21.06
09:15:01 AM     all     32.45      0.00      6.59     46.74      0.00     14.21
09:25:01 AM     all     20.62      0.00      5.39     60.00      0.00     14.00
09:35:01 AM     all     31.03      0.00      3.61     33.95      0.00     31.41
09:45:01 AM     all     36.54      0.00      3.22     34.13      0.00     26.11
09:55:02 AM     all     40.17      0.00      3.66     30.98      0.00     25.19
10:05:01 AM     all     33.49      0.00      3.04     32.28      0.00     31.19
10:15:01 AM     all     48.63      0.00      2.87     25.50      0.00     23.00
10:25:01 AM     all     51.34      0.00      3.56     26.06      0.00     19.04
10:35:01 AM     all     39.41      0.00      3.44     29.86      0.00     27.29
10:45:02 AM     all     36.07      0.00      8.79     30.94      0.00     24.20
10:55:03 AM     all     38.04      0.00      7.98     32.98      0.00     21.01
11:05:11 AM     all     39.25      0.00      8.81     36.75      0.00     15.19
11:15:02 AM     all     35.19      0.00      8.76     41.98      0.00     14.07
11:25:03 AM     all     38.21      0.00      9.65     38.86      0.00     13.28
11:35:02 AM     all     42.92      0.00     11.66     34.28      0.00     11.14
11:45:02 AM     all     39.40      0.00      9.96     39.03      0.00     11.61
11:55:01 AM     all     28.72      0.00      3.27     36.32      0.00     31.69

after upgrade to 3.9

08:35:02 AM     all     40

[continue reading]

Posted by David Wheeler in pgExperts on 2013-06-06 at 19:02:00

I gave a tutorial at PGCon a couple weeks back, entitled “Agile Database Development with Git, Sqitch, and pgTAP.” It went well, I think. The Keynote document and an exported PDF have been posted on PGCon.org, and also uploaded to Speaker Deck. And embedded below, too. Want to follow along? Clone the tutorial Git repository and follow along. Here’s the teaser:

Read More »

Posted by Josh Berkus in pgExperts on 2013-06-06 at 01:42:00

Our first-ever unconference at pgCon was a smashing success this year, greatly enhancing collaboration and development for the upcoming versions of PostgreSQL.  Among other things, I think we got a specification for pluggable storage out of it, which is far beyond anything I'd expected.  Were you there?  How did you like it?  Give your feedback in the comments.


About 75 people attended, and hashed various topics out in 12 discussion sessions. These included expanding PostgreSQL testing, creating demo databases, enhancing full text search, upcoming JSON features, pg_upgrade, and of course the two-part Pluggable Storage/Foreign Data Wrapper discussion.  Anyway, it's all on the wiki if you missed it.

We're definitely doing one next year, so plan to stay an extra day at pgCon. 

I thank the following people for making the first unconference a success: Salesforce.com, our unconference sponsor;  Stacey Haysler, Nikhil Sontakke, Susanne Ebrecht, Hartmut and Ian Barwick for helping to run the event; and Dan Langille, conference organizer.  Thank you folks!

Yes, I know the Unconference was almost two weeks ago, but I lost my phone in Ottawa and this is the first I could get pictures up.
Posted by Joshua Tolley in EndPoint on 2013-06-05 at 15:48:00

Image by Wikimedia user Ardfern

Yesterday I ran on to a nice practical application of a number of slightly unusual SQL features, in particular, window functions. PostgreSQL has had window functions for quite a while now (since version 8.4, in fact, the oldest version still officially supported), but even though they're part of the SQL standard, window functions aren't necessarily a feature people use every day. As a bonus, I also threw in some common table expressions (also known as CTEs, also a SQL standard feature), to help break up what could have been a more confusing, complex query.

A client of ours noticed a problem in some new code they were working on. It was possible for users to submit duplicate orders to the system in quick succession, by double-clicking or something similar. This was fixed in the code easily enough, but we needed to clean up the duplicate orders in the database. Which meant we had to find them. We defined a group of duplicates as all orders involving the same line items, with one of a set of possible status codes, created in an interval of less than five minutes by the same user.

This discussion of the time interval between two different records should immediately signal "window functions" (or possibly a self-join, but window functions are much easier in this case). A window function takes a set of rows and lets you chop them up into subsets, processing the subsets in various ways. In this case, we want to take all the rows in the orders table with a particular status value, group them by the customer who placed the order as well as by the items in the order, and then evaluate each of those groups.

As might be expected, the items associated with an order are in a different table from the orders themselves. There are probably several different ways I could have compared the items; I chose to accumulate all the items in an order into an array, and compare the resulting arrays (I imagine this would be awfully slow if orders had many different items attached to them, but it wasn't a pr

[continue reading]