Posted by Ernst-Georg Schmid on 2015-08-04 at 06:18:00
In the announcement of PipelineDB, Josh Berkus gave a hint that - and how - similar functionality might be achieved with stock PostgreSQL. Let's see if we can analyze a continuous Twitter stream too...

First, we need a fast table to act as the ringbuffer:

CREATE UNLOGGED TABLE tstest.tstream
(
  rec timestamp with time zone NOT NULL DEFAULT clock_timestamp(),
  tevent jsonb NOT NULL,
  CONSTRAINT pk_tstream PRIMARY KEY (rec)
)
WITH (
  OIDS=FALSE
);


To inject new tweets into the buffer and trim the outdated ones, I used this little injector, written in Java using the hosebird client and the example code provided there.

Once it runs in its infinite loop, the table is populated with new tweets, while tweets older than one hour are evicted every five minutes, courtesy of a fixed-rate timer task. So, the maximum size of the buffer table is one hour and five minutes' worth of tweets in the worst case.
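The eviction itself boils down to a single DELETE; a minimal sketch of what the timer task could run every five minutes, using the table defined above:

DELETE FROM tstest.tstream
 WHERE rec < (clock_timestamp() - interval '1 hour');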

The analysis view is taken verbatim from the PipelineDB example:

  CREATE OR REPLACE VIEW tstest.tagstream as
    SELECT jsonb_array_elements(tevent #>
      ARRAY['entities','hashtags']) ->> 'text' AS tag
    FROM tstest.tstream
    WHERE rec >
          ( clock_timestamp() - interval '1 hour' );


Plus the missing ')' before ->> 'text'. :-)


As is the most popular tags query:

select tag, count(*) as tag_count
  from tstest.tagstream
 group by tag
 order by tag_count desc
 limit 10;


Does it work? Yes.

Since I don't have access to my development machine right now, stay tuned for Part 2 with some results.

One of the features coming in PostgreSQL 9.5 is the triumvirate GROUPING SETS, CUBE, and ROLLUP nicely covered in Bruce's recent slide deck. The neatest thing about PostgreSQL development is that when improvements happen, they don't just affect the core, but can be taken advantage of by extensions, without even lifting a finger. Such is the case with these features.

One of the things I was curious about with this new set of predicates was: would they work with any aggregate function? I assumed they would, so I decided to put it to the test by using them with the PostGIS ST_Union function (using PostGIS 2.2.0 development). This feature was not something the PostGIS Development group planned on supporting, but by the magic of PostgreSQL, PostGIS accidentally supports it. The grouping sets feature is particularly useful if you want to aggregate data multiple times, perhaps for display, using the same dataset. It allows you to do it with a single query that in other PostgreSQL versions would require a UNION query. This is a rather boring example but hopefully you get the idea.
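For a flavour of what that looks like, here is a minimal sketch (hypothetical table and column names) of GROUPING SETS combined with the ST_Union spatial aggregate; each grouping set is an aggregation you would otherwise have to compute as a separate query glued on with UNION ALL:

SELECT region, land_use, ST_Union(geom) AS merged_geom
  FROM parcels
 GROUP BY GROUPING SETS ((region), (region, land_use), ());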


Continue reading "PostgreSQL 9.5 Grouping Sets with PostGIS spatial aggregates"
Posted by Bruce Momjian in EnterpriseDB on 2015-07-31 at 23:45:01

I delivered my first Postgres 9.5 features presentation in Saint Petersburg two weeks ago. I will be presenting it in California next month, and probably in other cities in the coming months.

Posted by Andreas 'ads' Scherbaum on 2015-07-31 at 19:31:22

The 4th South Bay PostgreSQL Meetup will take place on August 18th, in Palo Alto.

Please sign up here.

Speakers:

Mathias Brossard: PostgreSQL for the 99%

Bruce Momjian: Upcoming PostgreSQL 9.5 Features

Note: there is another Meetup the day before, in San Francisco.

As the first post dedicated to the feature coverage of Postgres 9.6 which is currently in active development, let's talk about the following commit:

commit: 0dc848b0314d63188919f1ce943730eac684dccd
author: Peter Eisentraut
date: Tue, 28 Jul 2015 20:31:35 -0400
pg_basebackup: Add --slot option

This option specifies a replication slot for WAL streaming (-X stream),
so that there can be continuous replication slot use between WAL
streaming during the base backup and the start of regular streaming
replication.

When taking a base backup with pg_basebackup, one has surely already run into the problem of WAL files having already been removed from the node the backup was taken from by the time, for example, a fresh standby created from this base backup connects after a certain delay. In such a case the standby node complains with the following error:

FATAL:  could not receive data from WAL stream:
ERROR:  requested WAL segment 000000010000000000000003 has already been removed

This can be easily avoided by having a WAL archive with a proper restore_command set in the standby's recovery.conf, or by tuning wal_keep_segments with a number of segments roughly corresponding to the amount of data generated between the moment the base backup is started and the moment a standby node performing streaming from this base backup connects to its parent node. In the former case, some users may not have a WAL archive set up (well, they normally should, to be able to recover using only base backups). In the latter case, setting wal_keep_segments is not an exact science, and if the server faces a peak of activity you may still end up with missing WAL segments on the original node.
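Physical replication slots, discussed next, offer a more robust fix. As a rough sketch of the workflow (the slot name here is hypothetical): create the slot first, then hand it to pg_basebackup via --slot together with -X stream:

SELECT pg_create_physical_replication_slot('basebackup_slot');
-- then, from the shell (sketch):
--   pg_basebackup -D /path/to/backup -X stream --slot=basebackup_slot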

Well, this is where physical replication slots are actually useful, because once created and enabled for a given client, they are able to retain WAL segments as long as the slot's restart_lsn has not been consumed by this client. Now, combined with pg_basebackup, what you actually get is the possibility to create a base backup and to ens

[continue reading]

Posted by Josh Berkus on 2015-07-30 at 22:51:07


Original Core Team member Bruce Momjian will be visiting the Bay Area this August, and will be doing two different talks, both for SF City and North Bay. If you want the full Brucealicious experience, you can go to both! For San Francisco, we're taking a vote on which topic Bruce should present.

Join us, and welcome Bruce.

Photo (c) 2006 Michael Glaesmann

Posted by Jason Petersen in CitusData on 2015-07-30 at 20:47:34

pg_shard continues to gain momentum as a straightforward sharding extension for PostgreSQL. We’re ecstatic each time we hear about a new deployment and are always considering what’s next.

We’ve been hard at work addressing some of those customer needs and have released a new version of pg_shard this week with our latest efforts. The changes in this release include:

  • Full CitusDB integration — Distribution metadata is always in sync with CitusDB
  • Better type support — Partition by enumeration or composite types
  • Planning improvements — Improved internal metadata locking and function checking
  • Usability enhancements — Better validations and error messages during use

Upgrading or installing is a breeze: see pg_shard’s GitHub page for detailed instructions.

Whether you want a distributed document store alongside your normal PostgreSQL tables or need the extra computational power afforded by a sharded cluster, pg_shard can help. We continue to grow pg_shard’s capabilities and are open to feature requests.

Got questions?

If you have any questions about pg_shard, please contact us using the pg_shard-users mailing list.

If you discover an issue when using pg_shard, please submit it to our issue tracker on GitHub.

Further information is available on our website, where you are free to contact us with any general questions you may have.

Read more...

Posted by Josh Berkus in pgExperts on 2015-07-30 at 18:36:00


This September, I will be having a great Texas adventure, and you're invited along. 

September 8-10, Austin: DjangoCon.US.   I will be presenting "PostgreSQL Performance in 15 Minutes" on the 9th. 

September 9th or 10th, Austin: I will speak at AustinPUG.  Date, location and exact topic still TBA.  Follow the AustinPUG Meetup, or check back here in a week for an update.

September 16-18, Dallas: Postgres Open:  I will be doing my Replication 101 tutorial, and then Explaining Explain.

September is a great time to be in Texas: temperatures are reasonable, Texans are friendly, and there's lots of great Mexican food and barbecue.  So, register for DjangoCon and/or Postgres Open today and join me!

Oh, and the photo? That's one of the hand-thrown DjangoCon pint cups I'm making as speaker gifts for DjangoCon.  Don't you wish you'd submitted a talk, now?
Posted by Hadi Moshayedi in CitusData on 2015-07-29 at 08:03:01

Citus Data is excited to announce the release of cstore_fdw 1.3, which is available on GitHub at github.com/citusdata/cstore_fdw. cstore_fdw is an open source columnar store extension for PostgreSQL created by Citus Data which reduces the data storage footprint and disk I/O for PostgreSQL databases.

cstore_fdw Changes

The changes in this release include:

  • ALTER FOREIGN TABLE ADD/DROP COLUMN ... support. You can now add/drop columns on existing cstore tables. Default values on newly added columns are also supported, with some restrictions (see the sketch after this list).
  • Improved COPY FROM support. Column references in the COPY command are now supported. You can now specify the list of columns to be copied from a file to a cstore table.
  • Query performance improvements. Table row count estimation works better which allows the query planner to better prepare query plans.
  • (BugFix) Whole row references. You can now use row references in function calls like SELECT to_json(*) FROM cstore_table.
  • (BugFix) Insert concurrency. Deadlock issue during concurrent insert is resolved.
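As a brief sketch of the new DDL support (hypothetical table and column names; defaults on added columns carry the restrictions noted above):

ALTER FOREIGN TABLE customer_reviews ADD COLUMN review_source text DEFAULT 'web';
ALTER FOREIGN TABLE customer_reviews DROP COLUMN review_source;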

For installation and update instructions, you can download the cstore_fdw Quick Start Guide from our website or you can view installation and update instructions on the cstore_fdw page on GitHub.

To learn more about what’s coming up for cstore_fdw, see our development roadmap.

Got questions?

If you have questions about cstore_fdw, please contact us using the cstore-users Google group.

If you discover an issue when using cstore_fdw, please submit it to cstore_fdw’s issue tracker on GitHub.

Further information about cstore_fdw is available on our website where you can also find information on ways to contact us with questions.

Read more...

Posted by Josh Berkus in pgExperts on 2015-07-28 at 16:52:00
If you've been following the tech news, you might have noticed that we have a new open source PostgreSQL fork called "PipelineDB".  Since I've joined the advisory board of PipelineDB, I thought I'd go over what it is, what it does, and what you'd use it for.  If you're not interested in Pipeline, or Postgres forks, you can stop reading now.

PipelineDB is a streaming database version of PostgreSQL.  The idea of a streaming database, first introduced in the PostgreSQL fork TelegraphCQ back in 2003, is that queries are run against incoming data before it is stored, as a kind of stream processing with full query support.  If the idea of a standard database is "durable data, ephemeral queries" the idea of a streaming database is "durable queries, ephemeral data".  This was previously implemented in StreamBase, StreamSQL, and the PostgreSQL fork Truviso. In the Hadoop world, the concept is implemented in Apache SparkSQL.

On a practical level, what streaming queries do is allow you to eliminate a lot of ETL and redundant or temporary storage.

PipelineDB 0.7.7 is 100% of PostgreSQL 9.4, plus the ability to create Continuous Views, which are essentially standing queries that produce different data each time you query them, depending on the incoming stream.  The idea is that you create the queries which filter and/or summarize the data you're looking for in the stream, and store only the data you want to keep, which can go in regular PostgreSQL tables.

As an example of this, we're going to use PipelineDB to do tag popularity counts on Twitter.  Twitter has a nice streaming API, which gives us some interesting stream data to work with.  First I spun up a PipelineDB Docker container.  Connecting to it, I created the "twitter" database and a static stream called "tweets":
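As a rough sketch only, assuming PipelineDB's CREATE STREAM and CREATE CONTINUOUS VIEW syntax of the time and mirroring the tagstream query quoted in the first post on this page:

CREATE STREAM tweets (content json);

CREATE CONTINUOUS VIEW tagstream AS
  SELECT json_array_elements(content #> ARRAY['entities','hashtags']) ->> 'text' AS tag
    FROM tweets;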

Creating a static stream isn't, strictly speaking, necessary; you can create a Continuous View without one.  As a career DBA, though, implied object names give me the heebie-jeebies.  Also, in some future release of PipelineDB, static streams will have per

[continue reading]

Posted by Federico Campoli on 2015-07-27 at 07:07:00
After a while I finally started a PostgreSQL Meetup in Brighton.
I reckon there are several people interested in PostgreSQL in the UK and I've met many of them in the Sussex area.

I'm quite new to the Meetup platform so I still need to learn how to use it correctly. However, anybody is absolutely welcome to help to organise a PostgreSQL community in the area.

The link is here. http://www.meetup.com/Brighton-PostgreSQL-Meetup/
I've already scheduled a PostgreSQL beer, an informal chit chat about our database of choice and the future of the community.

Save the date,

Friday, August 7, 2015 6:30 PM.

The location is still TBA, but any nice pub in Brighton should be OK. Any suggestions are very welcome ;)

 

We’re finally at the end of the 10-part PGDB (PostgreSQL) performance series I use to initiate new developers into the database world. To that end, we’re going to discuss something that affects everyone at one point or another: index criteria. Or to put it another way:

Why isn’t the database using an index?

It’s a fairly innocuous question, but one that may have a surprising answer: the index was created using erroneous assumptions. Let’s explore what happens in a hospital environment with a pared-down table of patients.

DROP TABLE IF EXISTS sys_patient;
 
CREATE TABLE sys_patient
(
    patient_id  SERIAL   NOT NULL,
    full_name   VARCHAR  NOT NULL,
    birth_dt    DATE     NOT NULL,
    sex         CHAR     NOT NULL
);
 
INSERT INTO sys_patient (full_name, birth_dt, sex)
SELECT 'Crazy Person ' || a.id,
       CURRENT_DATE - (a.id % 100 || 'y')::INTERVAL
                    + (a.id % 365 || 'd')::INTERVAL,
       CASE WHEN a.id % 2 = 0 THEN 'M' ELSE 'F' END
  FROM generate_series(1, 1000000) a(id);
 
ALTER TABLE sys_patient ADD CONSTRAINT pk_patient_id
      PRIMARY KEY (patient_id);
 
CREATE INDEX idx_patient_birth_dt ON sys_patient (birth_dt);
CREATE INDEX idx_patient_sex ON sys_patient (sex);
 
ANALYZE sys_patient;

This particular hospital has a few queries that operate based on the sex of the patient, so someone created an index on that column. One day, another developer is doing some code refactoring and, being well-trained by the resident DBA, runs the query through EXPLAIN to check the query plan. Upon seeing the result, the dev curses a bit, tries a few variants, and ultimately takes the issue to the DBA.

This is what the developer saw:

EXPLAIN ANALYZE 
SELECT *
  FROM sys_patient
 WHERE sex = 'F';
 
                             QUERY PLAN                             
--------------------------------------------------------------------
 Seq Scan on sys_patient  
      (cost=0.00..19853.00 rows=498233 width=29)
      (actual time=0.018..541.738 rows=500000 loops=1)
   Filter: (sex = 'F'::bp

[continue reading]

Posted by Nikhil Sontakke on 2015-07-24 at 13:46:01

Why PostgreSQL? Why not MySQL, or the other shiny glossy NoSQL offerings out there? We often get asked these questions while interacting with customers who want to know where exactly will their data be stored in our cloud offering.


It’s a valid question, because we at SecureDB are in the business of storing sensitive customer data (like files, objects, or alphanumeric data such as SSN, date of birth, address, race, username, password, etc.). Here are the top reasons why we picked PostgreSQL to build our encrypted data store.

 

ACID is (still) the acid test
We store sensitive data. So, we needed an Enterprise grade DB that does not compromise on data integrity and consistency. Ever! PostgreSQL is that database – rock solid, open source, enterprise class. We love ACID when it comes to our data and PostgreSQL gives us that.

 

Postgres has seen a surge in its popularity and adoption, particularly in the last few years. There’s a great demand for PostgreSQL from all quarters, be it large enterprises, startups, financial companies or governmental agencies, not just for the great cost-savings that it might bring to the table but also because of its stability, maturity and features.


Source: http://www.techrepublic.com/article/postgres-pushes-past-mysql-in-developer-hearts/

Super Extensible
PostgreSQL is very extensible. One can introduce new data types, new procedural languages, new functions, and even new operators. And if you compare Postgres vis-à-vis the competition, it has a better built-in feature set. The last few releases have seen a greater push towards enterprise scalability and performance. And along with new features like BDR and sister projects for distributed multi-master offerings like postgres-xl, things are only going to get better.

 

Licensed to Kill
Another advantage, especially from a startup’s point of view, is the liberal licensing model of PostgreSQL. It’s a BSD/MIT-style license, which allows you to make ANY kind of modification without any indemnification worries at all. Fo

[continue reading]

Posted by Gulcin Yildirim in 2ndQuadrant on 2015-07-24 at 10:47:09
Posted by Michael Paquier on 2015-07-24 at 05:24:38

Docker is well-known, used everywhere and by everybody, and is a nice piece of technology; there is nothing to say about that.

Now, before moving on with the real stuff, note that for the sake of this post all the experiments are done on a Raspberry Pi 2, to increase the difficulty of the exercise a bit and to gain a wider understanding of how to manipulate Docker containers and images at a rather low level, for the reasons given in the next paragraph.

So, to move back to Docker... It is a bit sad to see that there are not many container images based on ARM architectures, even though there are many such machines around. Also, the size of a single container image can easily reach a couple of hundred megabytes in its simplest shape (which does not change the fact that some of those images are very popular, so perhaps the author of this blog should not do experiments on such small-scale machines to begin with).

Not all container images are that large though; there is for example one container based on the minimalistic distribution Alpine Linux, with a size of less than 5MB. Many packages are available for it as well, which makes it a nice base image for more extended operations. Now, the fact is that even if Alpine Linux does publish deliverables for ARM, there are no Docker containers around that make use of them, and trying to use a container image that has been compiled for, say, x86_64 would just result in an epic failure.

Hence, by extending a bit a script from the upstream Docker facility of Alpine Linux, it is actually easily possible to create from scratch a container image able to run on ARM architectures (the trick has been to consider the fact that Alpine Linux publishes its ARM deliverables with the alias armhf). Note in any case the following things about this script:

  • root rights are needed
  • an ARM environment needs to be used to generate an ARM container
  • the script is here

Roughly, what this script does is fetching a minimal base image of Alpine Linux and then importing it in an image using "d

[continue reading]

Posted by Josh Berkus in pgExperts on 2015-07-23 at 20:46:00
By now you've read about PostgreSQL 9.5 and our shiny new UPSERT feature.  And you're thinking, "hey, 9.5 is still in alpha, how do I get me some UPSERT action now?"  While there are some suggestions for workarounds on the PostgreSQL wiki, I'm going to show you one more method for approximating UPSERT while you help us test 9.5 so that it will come out sooner.

The general suggested way for handling UPSERT situations is one of the following:
  1. Try to do an INSERT.  On error, have the application retry with an UPDATE.
  2. Write a PL/pgSQL procedure which does insert-then-update or update-then-insert.
Both of these approaches have the drawback of very high overhead: the first involves multiple round-trips to the database and lots of errors in the log, and the second involves major subtransaction overhead.  Neither is concurrency-safe, but then the method I'm about to show you isn't either.  At least this method avoids a lot of the overhead, though.

What's the method?  Using writeable WITH clauses.  This feature, introduced in 9.1, allows you to do a multi-step write transaction as a single query.  For an example, let's construct a dummy table with a unique key on ID and a value column, then populate it:

     create table test_upsert ( id int not null primary key, 
        val text );
     insert into test_upsert select i, 'aaa'
       from generate_series (1, 100) as gs(i);


Now, let's say we wanted to update ID 50, or insert it if it doesn't exist.  We can do that like so:

    WITH
    newrow ( id, val ) as (
        VALUES ( 50::INT, 'bbb'::TEXT ) ),
    tryupdate as (
        UPDATE test_upsert SET val = newrow.val
        FROM newrow
        WHERE test_upsert.id = newrow.id
        RETURNING test_upsert.id
    )
    INSERT INTO test_upsert
    SELECT id, val
        FROM newrow
    WHERE id NOT IN ( SELECT id FROM tryupdate );


The above tries to update  ID=50.  If no rows are updated, it inserts them.  This also works for multiple rows:

    WITH
    newrow ( id, val ) as (
        VALUES ( 75::INT, 'cc

[continue reading]

pgchem::tigress can generate molecular formulae like C3H6NO2- from chemical structures.

But what if we need access to the elemental composition as a relation, e.g:

 element | count
---------+-------
 C       |     3
 N       |     1
 O       |     2

Fortunately, PostgreSQL is awesome:

CREATE OR REPLACE FUNCTION elemental_composition(molformula TEXT)
  RETURNS TABLE(element TEXT, count INTEGER) AS
$BODY$
DECLARE
  token TEXT[];
  elements TEXT[];
BEGIN
  elements := ARRAY['C','N','O','P','S','Cl'];
  -- strip the charge signs from the formula
  molformula := REPLACE(REPLACE(molformula,'-',''),'+','');

  FOREACH element IN ARRAY elements LOOP
    count := 1;
    -- look for the element symbol, optionally followed by a count
    token := REGEXP_MATCHES(molformula, element || '[\d?]*');

    IF (token[1] IS NOT NULL) THEN
      token := REGEXP_MATCHES(token[1],'[0-9]+');
      IF (token[1] IS NOT NULL) THEN
        count := token[1]::INTEGER;
      END IF;
      RETURN NEXT;
    END IF;
  END LOOP;
  RETURN;
END;
$BODY$
  LANGUAGE plpgsql IMMUTABLE STRICT
  COST 1000;


SELECT * FROM elemental_composition('C3H6NO2-');

And that's it. Did I already mention that PostgreSQL is awesome? :-)

As a member of the PGConf Silicon Valley Conference Committee, I have been extremely happy with the volume and quality of the talks submitted to the conference. The Committee has been working hard on sorting through the talks, and I am pleased to announce the first 5 of the 24 total breakout sessions:

  • Grant McAlister, Senior Principal Engineer for Amazon.com, on “Cloud Amazon RDS for PostgreSQL - What’s New and Lessons Learned”
  • Kenny Gorman, CTO of Data for Rackspace, on “Cloud PostgreSQL Automation Management with Ansible”
  • Magnus Hagander, Database Architect, Systems Administrator & Developer for Redpill Linpro, on “What’s New in PostgreSQL 9.5”
  • Ryan Lowe, Production Engineer at Square, on “Postgres for MySQL DBAs”
  • Matthew Kelly, In House Postgres Expert for TripAdvisor, on “At the Heart of a Giant: Postgres at TripAdvisor”

PGConf Silicon Valley is November 17-18, 2015 at the South San Francisco Conference Center. It is a technical conference aimed at the local Silicon Valley PostgreSQL community and is an opportunity for leading industry experts and the local PostgreSQL community to discuss and learn about the major new capabilities of PostgreSQL 9.4 (and 9.5!) and how to optimize a PostgreSQL environment.

If you plan to attend, Super Saver pricing is available through July 25, 2015. You can reserve your seat now by visiting the conference website.

In addition to great talks, we're also pleased to see great support from sponsors with Platinum level sponsor 2ndQuadrant joined by additional sponsors EnterpriseDB, VividCortex, PostgreSQL Experts, and Consistent State. We're also happy to welcome our first two media sponsors, Database Trends & Applications and Datanami.

We hope you will register now to attend the conference at the highly discounted Super Saver rates, which end on July 25, 2015. See you in November!

Read more...

Posted by Julien Rouhaud in Dalibo on 2015-07-22 at 10:48:16

Have you ever wished to know what configuration changed during the last weeks, when everything was so much faster, or wanted to check what happened on your beloved cluster while you were on vacation?

pg_track_settings is a simple, SQL-only extension that helps you to know all of that and more, very easily. As it’s designed as an extension, it requires PostgreSQL 9.1 or later.

Some insights

As with almost any extension, you have to compile it from source, or use the pgxn client, since there’s no package yet. Assuming you just extracted the tarball of release 1.0.0, with a typical server configuration:

$ cd pg_track_settings-1.0.0
$ sudo make install

Then the extension is available. Create the extension on the database of your choice:

postgres=# CREATE EXTENSION pg_track_settings ;
CREATE EXTENSION

In order to historize the settings, you need to schedule a simple function call on a regular basis. This function is the pg_track_settings_snapshot function. It’s really cheap to call, and won’t have any measurable impact on your cluster. This function will do all the smart work of storing all the parameters that changed since the last call.

For instance, if you want to be able to know what changed on your server with a 5-minute accuracy, a simple cron entry like this for the postgres user is enough:

*/5 *  * * *     psql -c "SELECT pg_track_settings_snapshot()" > /dev/null 2>&1

A background worker could be used on PostgreSQL 9.3 and later, but as we only have to call one function every few minutes, it’d be overkill to add one just for this. If you really want one, you’d better consider setting up PoWA for that, or another extension that can run scheduled tasks, like pgAgent.

How to use it

Let’s call the snapshot function to get the initial values:

postgres=# select pg_track_settings_snapshot();
 pg_track_settings_snapshot
----------------------------
 t
(1 row)

A first snapshot with the initial settings values is saved. Now, I’ll just change a setting in the postgresql.conf file (ALTER SYSTEM could also be used on a PostgreSQL 9.4 or more rele

[continue reading]

Posted by Federico Campoli on 2015-07-22 at 06:40:00

INSERT

The INSERT statement is composed of two elements, the attribute list and the values list. The lists are positional. The formatting should clearly mark the various elements of the query, in order to make it simpler to spot the correspondence between each attribute and its value, as sketched below.

  • After the words INSERT INTO there is one space followed by the table's name
  • After the table's name there is one carriage return
  • The attributes, if present, are enclosed between round brackets indented one tab from the word INSERT
  • The attribute's list indents one tab from the round brackets
  • The attributes indent one tab from the round brackets and are separated by a carriage return
  • The word VALUES indents at the same level as INSERT and is followed by a carriage return
  • The values list is surrounded by round brackets indented one tab from the word VALUES
  • The values indent one tab from the round brackets and are separated by a carriage return
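
A minimal sketch following the rules above (hypothetical table and columns):

INSERT INTO t_address
    (
        i_id_address,
        t_city,
        t_street
    )
VALUES
    (
        1,
        'Brighton',
        'Some street'
    );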


UPDATE

The WHERE condition on the UPDATE is the same as for the SELECT. Take a look at 1.1.1 for more information. Just after the table's name there is the keyword SET and the list of the updates. The formatting should reflect this particular structure, as sketched below.
  • After the table's name there is one carriage return
  • The word SET indents one tab from the word UPDATE
  • The update's list indents one tab from the word set
  • Each update is separated by a carriage return 
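
A minimal sketch following the rules above, with the WHERE clause structured as in the SELECT (hypothetical table and columns):

UPDATE t_address
    SET
        t_city = 'Brighton',
        t_street = 'Some street'
WHERE
    i_id_address = 1;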


DELETE

The DELETE is the simplest of the DML statements, as it lists just the target relation and the optional WHERE condition. The formatting is quite simple, as the WHERE condition is the only part which requires structuring, as seen in the SELECT; a sketch follows.
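
A minimal sketch (hypothetical table and column):

DELETE FROM t_address
WHERE
    i_id_address = 1;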


Data Definition Language

The data definition language groups all the commands defining, altering or dropping the database's relations. Each command type follows a different syntax structure.

CREATE

The CREATE statement is composed of the verb CREATE followed by the object type (e.g. TABLE) and the new relation's name, followed by a carriage return. The rest of the statemen

[continue reading]

On 3rd of July, Tom Lane committed patch: Add psql \ev and \sv commands for editing and showing view definitions.   These are basically just like the \ef and \sf commands for functions.   Petr Korobeinikov, reviewed by Jeevan Chalke, some changes by me It's not a huge patch, but it's the first patch for […]
Posted by Ernst-Georg Schmid on 2015-07-21 at 11:04:00
1st impression of pg_shard:

  1. It works as described, documented limitations apply
  2. Status '3' of a shard_placement means inactive, '1' means online
  3. Read the issues list - unless you want to discover them yourselves :-)
  4. If a whole worker node goes offline this may go unnoticed, since there seems to be no heartbeat between the head and worker nodes unless you try to write data
PostgreSQL 9.5 is just around the corner and many cool new features have been added to this wonderful release. One of the most exciting ones is definitely SKIP LOCKED. To make sure that concurrent operations don’t lead to race conditions, SELECT FOR UPDATE has been supported for many years now and it is essential to […]
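As a reminder of what SKIP LOCKED does, here is a minimal sketch (hypothetical queue table): concurrent workers each grab a different unprocessed row instead of blocking on one another.

SELECT id
  FROM t_job_queue
 WHERE processed = false
 ORDER BY id
 LIMIT 1
 FOR UPDATE SKIP LOCKED;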

An easy way to give PGDB (PostgreSQL) a performance boost is to judiciously use indexes based on queries observed in the system. For most situations, this is as simple as indexing columns that are referenced frequently in WHERE clauses. PGDB is one of the few database engines that takes this idea even further with partial indexes. Unfortunately as a consequence of insufficient exposure, most DBAs and users are unfamiliar with this extremely powerful functionality.
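For readers who have not seen one, a partial index is simply an index with its own WHERE clause; a minimal sketch (hypothetical names):

CREATE INDEX idx_order_incomplete
    ON some_order_table (order_dt)
 WHERE order_state IN ('N', 'P');

Only rows matching the predicate are indexed, so the index stays small when the interesting rows are a tiny fraction of the table.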

Imagine we have an order system that tracks order state, such that entries are marked as new, processing, or done. These kinds of transient states are not uncommon in various inventory management systems, so it’s a great example for this use case. Often with such systems, data is distributed in such a way that more than 90% of orders are marked as ‘done’. To make this interesting, let’s just cap the done state at 90%, and distribute another 5% to processing, and 5% to new.

This somewhat complex SQL should emulate the above scenario:

DROP TABLE IF EXISTS sys_order;
 
CREATE TABLE sys_order
(
    order_id     SERIAL       NOT NULL,
    account_id   INT          NOT NULL,
    product_id   INT          NOT NULL,
    item_count   INT          NOT NULL,
    order_state  CHAR         NOT NULL DEFAULT 'N',
    order_dt     TIMESTAMPTZ  NOT NULL DEFAULT now(),
    valid_dt     TIMESTAMPTZ  NULL
);
 
INSERT INTO sys_order (
         account_id, product_id, item_count, order_state,
         order_dt, valid_dt
       )
SELECT (a.id % 100) + 1, (a.id % 100000) + 1, (a.id % 100) + 1,
       CASE WHEN b.id BETWEEN 1 AND 5 THEN 'N'
            WHEN b.id BETWEEN 6 AND 10 THEN 'P'
            ELSE 'D'
       END,
       now() - (a.id % 1000 || 'd')::INTERVAL
             + (a.id || 'ms')::INTERVAL,
       CASE WHEN a.id % 499 = 0
            THEN NULL
            ELSE now() - (a.id % 999 || 'd')::INTERVAL
       END
  FROM generate_series(1, 10000) a(id),
       generate_series(1, 100) b(id);
 
ALTER TABLE sys_order ADD CONSTRAINT pk_order_id
      PRIMARY KEY (orde

[continue reading]

Posted by Pavel Stehule on 2015-07-16 at 04:08:00
When you run psql without arguments, psql tries to connect to a database named like the current user. I dislike this behavior. Surely I have no database "pavel". There is a trivial solution - add this code to your .bashrc:
function psql {
    if [[ $# -eq 0 ]]; then
        env psql postgres
    else
        env psql "$@"
    fi
}

When: 7-8pm Wed July 22
Where: Oregon Convention Center, Room E141

We’re having a Birds of a Feather session at OSCON instead of our usual July meeting.

Registration is not required to attend the BoF. O’Reilly would prefer we register, though – here’s a code for a free pass if you would like to also check out the Expo Hall: https://twitter.com/oscon/status/621013720257269760


Posted by Greg Sabino Mullane in End Point on 2015-07-15 at 18:36:00

Being able to disable Postgres triggers selectively can be an important skill when doing tasks like bulk updates, in which you only want a subset of the triggers on the table to be fired. Read below for the long explanation, but the TL;DR version of the best solution is to set a WHEN clause on the trigger you wish to skip, making it conditional on a variable such as session_replication_role or application_name.


CREATE TRIGGER mytrig AFTER INSERT ON foobar FOR EACH 
  ROW WHEN (current_setting('session_replication_role') <> 'local') EXECUTE PROCEDURE myfunc();
BEGIN;
SET LOCAL session_replication_role = 'local';
UPDATE foobar SET baz = 123;
COMMIT;

I decided to spin up a free Heroku "Hobby Dev" database to illustrate the solutions. Generating a test table was done by using the Pagila project, as it has tables which contain triggers. Heroku gives you a randomly generated user and database name. To install the Pagila schema, I did:

$ export H="postgres://vacnvzatmsnpre:2iCDp-46ldaFxgdIx8HWFeXHM@ec2-34-567-89.compute-1.amazonaws.com:5432/d5q5io7c3alx9t"
$ cd pagila-0.10.1
$ psql $H -q -f pagila-schema.sql
$ psql $H -q -f pagila-data.sql

Errors appeared on the import, but they can be safely ignored. One error was because the Heroku database does not have a user named "postgres", and the other error was due to the fact that the Heroku user is not a superuser. The data, however, was all intact. The sample data is actually quite funny, as the movie titles were semi auto-generated at some point. For example, seven random movie descriptions:

  • A Brilliant Panorama of a Madman And a Composer who must Succumb a Car in Ancient India
  • A Touching Documentary of a Madman And a Mad Scientist who must Outrace a Feminist in An Abandoned Mine Shaft
  • A Lackluster Reflection of a Eskimo And a Wretch who must Find a Fanny Pack in The Canadian Rockies
  • A Emotional Character Study of a Robot And a A Shark who must Defeat a Technical Writer in A Manhattan Penthouse
  • A Amazing Yarn of a Hunter And a Butler

[continue reading]

Posted by Andrew Dunstan in pgExperts on 2015-07-15 at 13:07:00
Say you have a partitioned table and you want to add a column. There is no problem - you just add the column to the parent table, and it is added to all the children. But what if you want to drop a column? Then things are not so straightforward. If the child's column was created before it was inherited then it won't be dropped just by dropping it on the parent. So it very much depends on how the child is set up. If you do:
create table child() inherits (parent);
then dropping a column in the parent drops it in the child too. But if you do:
create table child (like parent);
alter table child inherit parent;
then dropping a column in the parent won't drop it in the child. The pg_partman package follows this second pattern when setting up child partitions, as I discovered yesterday when a client ran into this problem. In this case you have to delete the column from the children yourself. I devised the following snippet of code to accomplish this after you have deleted the column from the parent:
do $$
declare
    child oid;
begin
    for child in
        select inhrelid
          from pg_inherits
         where inhparent = 'parent'::regclass
    loop
        execute 'alter table ' || child::regclass ||
                ' drop column if exists some_column';
    end loop;
end;
$$;

PostgreSQL 9.5 is coming up with a new feature aimed at simplifying the tracking of GUC parameters when they are spread across multiple configuration files, by introducing a new system view called pg_file_settings:

commit: a97e0c3354ace5d74c6873cd5e98444757590be8
author: Stephen Frost <sfrost@snowman.net>
date: Fri, 8 May 2015 19:09:26 -0400
Add pg_file_settings view and function

The function and view added here provide a way to look at all settings
in postgresql.conf, any #include'd files, and postgresql.auto.conf
(which is what backs the ALTER SYSTEM command).

The information returned includes the configuration file name, line
number in that file, sequence number indicating when the parameter is
loaded (useful to see if it is later masked by another definition of the
same parameter), parameter name, and what it is set to at that point.
This information is updated on reload of the server.

This is unfiltered, privileged, information and therefore access is
restricted to superusers through the GRANT system.

Author: Sawada Masahiko, various improvements by me.
Reviewers: David Steele

In short, pg_file_settings can prove to be quite useful when using a set of configuration files to set up a server, pulling them in with for example include or include_if_not_exists. For example, let's imagine a server with the following tiny configuration:

$ cat $PGDATA/postgresql.conf
shared_buffers = '1GB'
work_mem = '50MB'
include = 'other_params.conf'
$ cat $PGDATA/other_params.conf
log_directory = 'pg_log'
logging_collector = on
log_statement = 'all'

Then this new system view is able to show where each parameter comes from and the value assigned to it:

=# SELECT * FROM pg_file_settings;
          sourcefile          | sourceline | seqno |       name        | setting | applied | error
------------------------------+------------+-------+-------------------+---------+---------+-------
 /to/pgdata/postgresql.conf   |          1 |     1 | shared_buffers    | 1GB     | t       | null
 /to/pgdata/postgresql.conf   |    

[continue reading]