Posted by Michael Paquier in VMware on 2014-10-31 at 00:52:34

Behind this eye-catching title is an extension called blackhole that I implemented yesterday, tired of always having to put together a fresh extension skeleton whenever I needed one (copying one from the Postgres contrib/ directory works as well). Similarly to blackhole_fdw, which aims to be a template for a foreign-data wrapper, blackhole is an extension kept as minimalistic as possible, meant to be used as a base template for developing a Postgres extension in C.

When using it for your own extension, simply copy its code, create a new git branch or whatever suits you, and then replace the keyword blackhole throughout the code with something you want. Note as well that the following files need to be renamed:

blackhole--1.0.sql
blackhole.c
blackhole.control
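
To give an idea of how little is involved, here is a sketch of what the SQL script plausibly looks like (an assumption on my part; the authoritative contents are in the repository):

-- Sketch of blackhole--1.0.sql (assumed; see the repository for the real file)
CREATE FUNCTION blackhole()
  RETURNS text
  AS 'MODULE_PATHNAME', 'blackhole'
  LANGUAGE C;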

Once installed in its vanilla state, this extension does not really do much, as it only contains a single C function called blackhole, able to do the following non-fancy thing:

=# \dx+ blackhole
Objects in extension "blackhole"
  Object Description
----------------------
 function blackhole()
(1 row)
=# SELECT blackhole();
 blackhole
-----------
 null
(1 row)

Yes, it simply returns a NULL string.

The code of this template is available here, in blackhole/, with the rest of a set of PostgreSQL plugins managed in the pg_plugins repository. Hope that's useful (or not). If you have ideas to improve it, feel free to send a pull request, but let's keep it as small as possible.

And Happy Halloween!

Posted by Josh Berkus in pgExperts on 2014-10-30 at 23:32:00
I will be in Seattle soon on business.  This will include two opportunities to do PostgreSQL community stuff:
  • A meetup with SEAPUG on the 12th, where I will talk about the upcoming 9.4 features (not on the calendar yet, hopefully it'll be fixed soon)
  • A BOF at Usenix LISA where I talk about the same, only to a different crowd.
If you're in the Seattle area, come chat!  We'll go out to Dilettante and have chocolate!
Posted by Simon Riggs in 2ndQuadrant on 2014-10-30 at 09:28:00

Ben Bradlee, the former editor of the Washington Post, died recently. A famous speech of his from 1997 contains some words that mean something to me. It starts like this

"Newspapers don't tell the truth under many different, and occasionally innocent, scenarios. Mostly when they don't know the truth. Or when they quote someone who does not know the truth.

And more and more, when they quote someone who is spinning the truth, shaping it to some preconceived version of a story that is supposed to be somehow better than the truth, omitting details that could be embarrassing.

And finally, when they quote someone who is flat out lying...."

and summarises with

"Where lies the truth? That's the question that pulled us into this business, as it propelled Diogenes through the streets of Athens looking for an honest man. The more aggressive our serach for the truth, the more people are offended by the press. The more complicated are the issues and the more sophisticated are the ways to disguise the truth, the more aggressive our search for the truth must be, and the more offensive we are sure to become to some. So be it."

before ending

"I take great strength from that now, knowing that in my experience the truth does emerge. It takes forever sometimes, but it does emerge. And that any relaxation by the press will be extremely costly to democracy."

Who would have thought that his words apply so well to PostgreSQL, and especially to the cost of data integrity? Yes, referential integrity does cost some performance to make it work right, but how else can we be sure that we are passing valid data around? Surely the purpose of a database is, first and foremost, to be a home for the truth, verified to be so by cross-checks and constraints.

As of today, Continuent is part of VMware. We are absolutely over the moon about it.

You can read more about the news on the VMware vCloud blog by Ajay Patel, our new boss. There’s also an official post on our Continuent company blog. In a nutshell the Continuent team is joining the VMware Cloud Services Division. We will continue to improve, sell, and support our Tungsten products and work on innovative integration into VMware’s product line.

So why do I feel exhilarated about joining VMware? There are three reasons. 

1. Continuent is joining a world-class company that is the leader in virtualization and cloud infrastructure solutions. Even better, VMware understands the value of data to businesses. They share our vision of managing an integrated fabric of standard DBMS platforms, both in public clouds and in local data centers. It is a great home to advance our work for many years to come.

2. We can continue to support our existing users and make Tungsten even better. I know many of you have made big decisions to adopt Continuent technology that would affect your careers if they turned out badly. We now have more resources and a mandate to grow our product line. We will be able to uphold our commitments to you and your businesses.

3. It's a great outcome for our team, which has worked for many years to make Continuent Tungsten technology successful. This includes our investors at Aura in Helsinki, who have been dogged in their support throughout our journey.

Speaking of the Continuent team…I am so proud of what all of you have achieved. Today we are starting a new chapter in our work together. See you at VMware!
Posted by Hans-Juergen Schoenig in Cybertec on 2014-10-29 at 09:58:30
PostgreSQL has offered support for powerful analytics and windowing for a couple of years now. Many people all around the globe use analytics to make their applications more powerful and even faster. However, there is a small feature in the area of analytics which is not that widely known: the power to use composite […]
Author
Andreas 'ads' Scherbaum

FOSDEM PGDay is a one-day conference that will be held ahead of FOSDEM in Brussels, Belgium, on January 30th, 2015. This will be a focused PostgreSQL event, with a single track of talks. This conference day will be for-charge, costing €50, and will be held at the Brussels Marriott Hotel. Registration is required to attend, and registration is now open. Since we have a limited number of seats available for this event, we urge everybody to register as soon as possible.

PostgreSQL Europe will also have our regular devroom at FOSDEM on Saturday the 31st, held at the main FOSDEM venue at ULB. This day will, of course, continue to be free of charge and open to all FOSDEM attendees. No registration is required for this day.

For full details about the conference, venue and hotel, see http://fosdem2015.pgconf.eu/.


The call for papers is now open for both these events. We are looking for talks to fill both these days with content for both insiders and new users. Please see http://fosdem2015.pgconf.eu/callforpapers/ for details and submission information.

The deadline for submissions is November 24th, 2014, but we may as usual pre-approve some talks, so get your submissions in soon!


We have also negotiated a rate with the Brussels Marriott Hotel. For details, see http://fosdem2015.pgconf.eu/venue/.

I don't know how many times I have had to try to drum this into clients' heads. Having an up-to-date replica won't protect you against certain kinds of failures. If you really want to protect your data, you need to use a proper backup solution - preferably a continuous backup solution. The ones I prefer to use are barman and wal-e. Both have strengths and weaknesses, but both are incredibly useful, fairly well documented, and simple to set up. If you're not using one of them, or something similar, your data is at risk.

(In case you haven't guessed, today is another of those days when I'm called in to help someone where the master and the replica are corrupted and the last trusted pg_dump backup is four days old and rolling back to it would cost a world of pain. I like these jobs. They can stretch your ingenuity, and no two are exactly alike. But I'd still rather be paid for something more productive.)
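
For reference, both barman and wal-e build on PostgreSQL's continuous archiving machinery; below is a minimal sketch of the server-side settings (9.4 syntax; the archive_command shown is illustrative of a wal-e setup, and each tool documents its own):

-- Requires a server restart for wal_level and archive_mode to take effect.
ALTER SYSTEM SET wal_level = 'archive';
ALTER SYSTEM SET archive_mode = on;
-- Illustrative only: wal-e is typically wrapped with envdir for its credentials.
ALTER SYSTEM SET archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p';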
Posted by torsten foertsch on 2014-10-27 at 20:16:00
After attending this year's pgconf.eu I decided to write my first Postgres extension. I did this with two goals in mind. First of all, I wanted the extension. Secondly, I wanted to document how long it takes and how complicated it is.
A few words about my background: I am mainly a Perl programmer, with a few years of experience programming C before I switched to Perl. Knowing C, I am not afraid to write Perl extensions in XS. I am also a contributor to the Apache mod_perl project, which is likewise written in C. I have known Postgres since version 6.4, but I didn't really delve into it until last year. Since then, I have learned a lot and even found a bug in Postgres. The conference in Madrid was my second event of that type; before that, I attended the PG conference in Dublin last year.

The idea

At work, we replicate databases over transcontinental links. Monitoring the replication lag is essential for us. With a stock PG 9.3, one can use the pg_stat_replication view on the master to get a picture of how far the replicas lag behind. Usually, I use a command like this:

SELECT client_addr,
       application_name,
       state,
       flush_location,
       pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),
                                            flush_location)) AS lag
FROM pg_stat_replication
ORDER BY 2, 1;

There are two points to watch out for. First, the query must run on the master. This may or may not be a problem. Second, the lag output is in units of bytes. If you have a guess of the average bandwidth between master and replica, you might be able to derive an educated guess of how long it will take for the lag to dissolve.
Anyway, at the conference I listened to a talk by Petr Jelinek about Uni-Directional Replication. He mentioned that he had measured the replication lag in units of time. Later I asked him how. It turned out he simply updated a special table with a timestamp. That way, records containing the time on the master are written to the WAL stream and replicated to the slave. Given both master and slave are

[continue reading]
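
The trick Petr described can be sketched in a few lines of SQL (a minimal sketch with a hypothetical table name, assuming the clocks of master and slave are synchronized):

-- On the master: a single-row heartbeat table, refreshed periodically
-- (e.g. from cron); every UPDATE writes the master's clock into the WAL stream.
CREATE TABLE replication_heartbeat (ts timestamptz NOT NULL);
INSERT INTO replication_heartbeat VALUES (now());
UPDATE replication_heartbeat SET ts = now();

-- On the replica: the age of the replicated timestamp approximates
-- the replication lag in units of time.
SELECT now() - ts AS replication_lag FROM replication_heartbeat;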

Logical decoding is a superset of the existing standby protocol. Hence, after decoding changes from WAL, an output plugin can shape them in any kind of way, making it possible, for example, to have a plugin that does the exact reverse of the decoding a PostgreSQL server instance did, reproducing similar WAL records that could be replayed much as on a standby. Not sure if this would actually be useful, but well, it's possible...

One of the great things about this new 9.4 infrastructure is that a client receiving the logical changes can make the PostgreSQL instance decoding the changes believe that the receiver is itself a standby. It does so by using the same replication protocol that vanilla streaming standbys use, present since 9.0 for asynchronous nodes and since 9.1 for "synchronous" nodes (where the master node waits for the commit confirmation from a standby, guaranteeing no loss of data after a commit). There are three things to be aware of on the receiver side when looking for such behavior with a logical receiver.

First, using the replication protocol is necessary to let the master node think that what is connected is a kind of standby. Extracting logical changes is also possible with the dedicated functions pg_logical_slot_peek_changes and pg_logical_slot_get_changes (and their binary equivalents), but do not count on them if you want the master to wait until the receiver has confirmed a change ("confirmed" is an abuse of the term, as this depends on how the receiver consumes those changes).

Second, a master node ranks its standbys by priority using the parameter synchronous_standby_names, the synchronous standby being the one with the lowest priority strictly higher than zero. So when using a receiver, be sure that it connects to the master node with an application_name giving it a proper identifier, resulting in a connection string similar to this minimal configuration:

dbname=my_logical_database replication=database application_name=my_receiver
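
On the master side, the receiver then needs to be listed as a synchronous candidate; a minimal sketch, assuming the receiver identifies itself as my_receiver as in the connection string above:

-- Run on the master; 'my_receiver' must match the application_name
-- the receiver connects with.
ALTER SYSTEM SET synchronous_standby_names = 'my_receiver';
SELECT pg_reload_conf();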

[continue reading]

Slides from my talk, How we use PostgreSQL at Trustly, are now available.


Posted by Bruce Momjian in EnterpriseDB on 2014-10-24 at 20:45:01

PostgreSQL Conference Europe has just finished, and I delivered two new presentations at the conference (a first for me). Postgres Scaling Opportunities summarizes scaling options, and Flexible Indexing with Postgres summarizes indexing options. I have been submitting these presentations to conferences for many months, but this was the first conference to choose them. The talks were well attended and generated positive feedback.

The talks are surveys rather than talks focusing on specific technologies. It might be beneficial for future conferences to have more survey-oriented talks, as many attendees are not Postgres experts.

I am now heading to Russia for two weeks, presenting in St. Petersburg and Moscow.

Continue Reading »

Slides from my talk, Finding and Repairing Database Corruption, are now available.

Up to now it was only possible to replicate entire database instances from one node to another. A slave always had to consume the entire transaction log created by the master, turning the slave into a binary copy. However, this is not always desirable. In many cases it is necessary to split the database […]
Posted by Josh Berkus in pgExperts on 2014-10-21 at 21:18:00
One of the things I mentioned in my series on VACUUM FREEZE was that we really needed a Postgres utility which would opportunistically freeze tables during low traffic periods. Today I'm announcing the Flexible Freeze project, our first attempt at designing such a utility.

All that's there right now is a simple Python script.  However, that script is already a useful tool, installed at multiple production sites. Here's how the script works:
  1. identify your active databases and daily/weekly low traffic periods.
  2. create a cron job which calls flexible_freeze.py with a time limit to keep it inside your low traffic window.
  3. flexible_freeze.py will loop through your tables with the oldest XIDs, freezing them until it runs out of time or out of tables.
There is also a second mode, using the --vacuum switch, which does VACUUM ANALYZE on the tables with the most dead rows (according to pg_stat_user_tables).  This is to help users who have a strong high/low traffic cycle and want to make sure that regular vacuuming takes place during low traffic.  If you're running both modes, we advise doing the freeze first.
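
For reference, "the oldest XIDs" means the tables ranked by relfrozenxid age; here is a sketch of the kind of ranking query such a tool might use (hypothetical, not the script's exact query):

-- Rank ordinary tables by transaction ID age; the tables at the top are
-- the ones a freeze pass would want to visit first.
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;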

Of course, I have a tanker-truck full of desired improvements/expansions to this.  So, pull requests welcome. 

If you're more into Ruby, Wanelo has rewritten flexible freeze for Rails and incorporated it into their Postmodern tool.

I’m seeing reports of a number of issues with PostgreSQL after upgrades of OS X machines to Yosemite (OS X 10.10) that I’m concerned about, so I’m seeking more information about the experiences of PostgreSQL users who’ve done OS X 10.10 upgrades.

I can’t confirm anything yet, but back up all your databases before any upgrade to OS X 10.10. Just in case. (Of course, you do that before any upgrade, but just in case it slipped your mind this time…).

I don’t have access to a Mac because Apple’s policy prevents developers from running OS X for testing and development (or anything else) without buying physical Apple hardware and finding somewhere to run it. So I can’t test most of this myself, and I really need reports from users, or if possible, results of proactive testing by OS X users.

OS X built-in PostgreSQL deleted on update

Some OS X users appear to use the PostgreSQL version built into OS X for their own data, rather than installing a new PostgreSQL. Some of them, in addition to using the binaries, also use a PostgreSQL cluster (database instance) that's created by OS X for the use of Server.app, instead of initdbing their own.

On releases prior to Yosemite the PostgreSQL provided by Apple was on the default PATH, though not necessarily running by default. It seems that on Yosemite it’s been removed; there’s no longer any /usr/bin/psql, etc. As far as I can tell Server.app now bundles PostgreSQL within the application bundle instead.

Some user reports suggest that on upgrade, the Apple-controlled databases are migrated into the new cluster managed by Server.app, and then the old cluster is stopped or possibly deleted – a colleague checked the upgrade script and found rm -rf /var/pgsql in it.

The PostgreSQL data directory in prior releases was /private/var/pgsql (and /var is a symlink to /private/var) or /Library/Server/PostgreSQL/Data.

The main symptom you’ll see is:

Connection refused
Is the server running locally and accepting
connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?

[continue reading]

Posted by robert berry on 2014-10-21 at 00:00:00

Monitoring Postgresql with a Background Worker

Oct 21, 2014 – Portland

pgantenna and pgsampler comprise an experimental PostgreSQL monitoring framework. This post explores how they work, and what problems they aim to solve.

Framework Overview

pgsampler is a background worker which collects data in a Postgresql cluster. It can log this data to CSV files or ship the metrics to a pgantenna instance over a tcp connection.

pgantenna is an application shipped as a Docker image which receives pgsampler data. It provides a web interface for live monitoring, checks for alerting conditions, and allows for psql access to a historical database of cluster metrics.

Motivation

There are a number of high quality monitoring and performance analysis tools for Postgresql. Many of these involve a remote service which connects to Postgresql as a regular client, or an application that parses log files.

The presented framework uses a background worker to ship statistics to a remote service. It aims to solve a grab bag of real or imagined problems discussed below. Of course, this approach presents its own problems and is thus best characterized as an experiment.

Live Monitoring

Data is sent from the cluster in a polling loop at one-second intervals. Different metrics can be tuned to desired sampling rates.

Using Postgres to Monitor Postgres

Dashboard plots and alert conditions are all written directly in SQL. For example, alert conditions are triggered whenever a cron-executed query returns a NULL in the first field of the first record. Plots are rendered with plotpg.
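
As an illustration of that convention, here is a hypothetical alert query (my example, not one shipped with pgantenna):

-- Fires (first field of the first row is NULL) when more than 80%
-- of max_connections are in use.
SELECT CASE
         WHEN count(*) < 0.8 * current_setting('max_connections')::int
         THEN count(*)
       END AS connections_ok
FROM pg_stat_activity;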

Historical Analysis with SQL

pgantenna provides a containerized remote cluster which stores historical data separate from transactional systems. The history is just a Postgresql database that can be queried with familiar tools.

Easy to Configure

The background worker uses direct access to identify and connect to databases automatically. Security concerns notwithstanding, this means very little configuration minutiae is needed to get started with comprehensive monitoring.

Close t

[continue reading]

Yesterday I had an interesting discussion on IRC. A guy wanted to know why his Perl script was causing problems when dealing with Pg and unicode characters. The discussion went sideways, I got (a bit) upset, and had to leave anyway, so I didn't finish it. But it did bother me, as for me the reasons […]
Posted by Mischa Spiegelmock on 2014-10-19 at 08:44:00

Like many others before and since, my introduction to the world of databases was via MySQL. It was a good enough database that did what I wanted, so I was content to use it without realizing what I was missing out on.

Comparing MySQL to PostgreSQL is a bit like comparing notepad.exe to emacs. While they both let you type in some text and save it to a file, that doesn’t quite tell the whole story. They are both RDBMSes that allow for the execution of queries via a client connection, but the philosophies, capabilities and sophistication of each are hardly even on the same plane of existence.

The way I think about it is that MySQL is a simple database program. Technically it does have pluggable engines; you can choose between a terrifically useless engine and one that at least knows what a transaction is. I consider Pg more of a "database framework": an incredibly sophisticated and flexible program that provides a standard frontend interface. Much as emacs ties everything together through "modes" and elisp, Pg ties everything together with a C client library and SQL.
To give you a more concrete example of what I mean, in Pg you can create "foreign data wrappers", where the actual implementation of a query can be defined by a plugin. One example of a foreign data wrapper is "MySQL" - you connect to Pg like normal through your client library and run a SQL query, but it's actually executed on a remote MySQL server and presented as a table view in Pg. Or perhaps your query is auto-parallelized and distributed among many mongoDB nodes instead. No big deal.
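
A sketch of what that looks like with the mysql_fdw extension (assuming it is installed; every name and option value here is illustrative):

CREATE EXTENSION mysql_fdw;
CREATE SERVER mysql_srv FOREIGN DATA WRAPPER mysql_fdw
    OPTIONS (host '127.0.0.1', port '3306');
CREATE USER MAPPING FOR CURRENT_USER SERVER mysql_srv
    OPTIONS (username 'app', password 'secret');
CREATE FOREIGN TABLE mysql_orders (id int, total numeric)
    SERVER mysql_srv OPTIONS (dbname 'shop', table_name 'orders');
-- This query executes on the remote MySQL server but reads like any table in Pg.
SELECT count(*) FROM mysql_orders;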
Pg also has an astonishingly powerful set of extensions. My absolute favorite is PostGIS, a geospatial extension that adds geographic and geometric types, operators, indexed storage and much much more. Seeing a demo of it at the SFPUG meetups blew my mind, and I’ve been an avid user of it ever since.
Did I mention it also has a full-text search capability every bit as useful as Solr or ElasticSearch, with stemming and your choice of GIN or GiST indexes? My life is so much better now that I ca

[continue reading]

The following feature is less important for performance, but for some people it can be important for aesthetic reasons: you can now use styles for unicode table borders. There are only two possible styles, but you can set the border, header, and column styles separately - that makes 6 combinations. On top of that there are 3 general border settings, so together there are 18 possible combinations of psql table output:
postgres=# \pset unicode_header_linestyle double
Unicode border linestyle is "double".
postgres=# \pset linestyle unicode
Line style is unicode.
postgres=# \l
                                  List of databases
   Name    │  Owner   │ Encoding │   Collate   │    Ctype    │   Access privileges
═══════════╪══════════╪══════════╪═════════════╪═════════════╪═══════════════════════
 postgres  │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │
 template0 │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │ =c/postgres          ↵
           │          │          │             │             │ postgres=CTc/postgres
 template1 │ postgres │ UTF8     │ en_US.UTF-8 │ en_US.UTF-8 │ =c/postgres          ↵
           │          │          │             │             │ postgres=CTc/postgres
(3 rows)

postgres=# \l
                              List of databases
   Name     Owner    Encoding   Collate      Ctype     Access privileges
═════════ ════════ ════════ ═══════════ ═══════════ ═════════════════════
postgres  postgres UTF8     en_US.UTF-8 en_US.UTF-8
template0 postgres UTF8     en_US.UTF-8 en_US.UTF-8 =c/postgres          ↵
                                                    postgres=CTc/postgres
template1 postgres UTF8     en_US.UTF-8 en_US.UTF-8 =c/postgres          ↵
                                                    postgres=CTc/postgres
(3 rows)


postgres=# \pset border 2
Border style is 2.
postgres=# \l
                                    List of databases
┌───────────┬──────────┬──────────┬─────────────┬─────────────┬───────────────────────┐
│   Name    │  Owner   │ Encoding │   Collate   │    Ctype    │   Access privileges   │
╞═══════════╪══════════╪══════════╪══════

[continue reading]

Posted by Feng Tian on 2014-10-17 at 18:22:00
We have just released our 9.3.5.S for public beta test.  Together with the product, we released a benchmark based on TPCH.   The modification to data types is easy to understand -- money and double types are faster than Numeric (and no one on this planet has a bank account that overflows the money type, not any time soon).   The modifications to queries are more interesting.

We modified the queries not because we want to show off how fast Vitesse DB is; without these modifications, some queries would never finish.    We have seen similar queries, and similar modifications required, in the field.   Overall, PostgreSQL is well capable of running the TPCH workload as long as developers pay attention to some "tricks".

Now let's look at them.

Q2: The skeleton of Q2 looks like
select xxx from t1, t2, ... where foo = (select min(bar) from tx, ty, .. where tx.field = t1.field ...);

This is a correlated subquery (tx.field = t1.field) with an aggregate.   You can pull the subquery out with a join:

select xxx from t1, t2, ...,
       (select tx.field, min(bar) as min_bar from tx, ty ... group by tx.field) tmpt
where t1.field = tmpt.field and foo = min_bar ...

The performance of the join (in this case a hash join, which is very fast) is two orders of magnitude better than that of the subplan query.

The same trick is applied to Q17.

In the next post, I will examine Q20, which uses a CTE (WITH clause).

The POODLE attack on https (the attack is against https; the vulnerability is in SSL - an important distinction) has received a lot of media attention lately, so I figured a (very) short writeup was necessary.

The TL;DR version is: you don't have to worry about POODLE for your PostgreSQL connections when using SSL.

The slightly longer version can be summarized by:

  • The PostgreSQL libpq client in all supported versions will only connect with TLSv1 and newer, which is not vulnerable.
  • The PostgreSQL server prior to the upcoming 9.4 version will, however, respond with SSLv3 (which is the vulnerable version) if the client insists on it (which a third-party client can do).
  • To exploit POODLE, you need a client that explicitly does out-of-protocol downgrading. Something that web browsers do all the time, but very few other clients do. No known PostgreSQL client library does.
  • To exploit POODLE, the attacker needs to be able to modify the contents of the encrypted stream - it cannot be passively broken into. This can of course happen if the attacker can control parameters to a SQL query for example, but the control over the data tends to be low, and the attacker needs to already control the client. In the https attack, this is typically done through injecting javascript.
  • To exploit POODLE, there needs to be some persistent secret data at a fixed offset in each connection. This is extremely unlikely in PostgreSQL, as the protocol itself has no such data. There is a "cancel key" at the same location in each stream, but it is not reused and a new one is created for each connection. This is where the https attack typically uses the session cookie which is both secret and fixed location in the request header.

For a really good writeup on the problem, see this post from PolarSSL, or this one from GnuTLS.

Introduced in PostgreSQL 9.4, pg_recvlogical has the ability to control the creation of the logical replication slots from which logical changes can be streamed; note that a slot is mandatory in this case when using logical decoding. In 9.4, pg_receivexlog has no such control over the physical replication slots it may stream from (a slot ensures that the WAL segment files this utility is looking for are still retained on the server side). This feature has been added for 9.5 with the following commit:

commit: d9f38c7a555dd5a6b81100c6d1e4aa68342d8771
author: Andres Freund <andres@anarazel.de>
date: Mon, 6 Oct 2014 12:51:37 +0200
Add support for managing physical replication slots to pg_receivexlog.

pg_receivexlog already has the capability to use a replication slot to
reserve WAL on the upstream node. But the used slot currently has to
be created via SQL.

To allow using slots directly, without involving SQL, add
--create-slot and --drop-slot actions, analogous to the logical slot
manipulation support in pg_recvlogical.

Author: Michael Paquier

This simply introduces two new options for creating and dropping a physical replication slot, respectively --create-slot and --drop-slot. The main difference from pg_recvlogical is that those additional actions are optional (no --start option has been introduced, for backward compatibility). Be careful of a couple of things when using this feature though. First, when a slot is created, streaming of the segment files begins immediately.

$ pg_receivexlog --create-slot --slot physical_slot -v -D ~/xlog_data/
pg_receivexlog: creating replication slot "physical_slot"
pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)

The slot created can then be found in the system view pg_replication_slots.

=# select slot_name, plugin, restart_lsn from pg_replication_slots ;
   slot_name   | plugin | restart_lsn
---------------+--------+-------------
 physical_slot | null   | 0/1000000
(1 row)

Then, when dropping a slot, as the process has nothing to stream it exits immedia

[continue reading]

After being on the road doing PostgreSQL consulting for Cybertec for over a decade, I have noticed that there are a couple of ways to kill indexing entirely. One of the most favored ways is to apply functions or expressions to the column people want to filter on. It is a sure way to kill […]
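A quick sketch of the pattern he is describing, with a hypothetical table (the expression-index fix at the end is the standard remedy):

-- An ordinary index on the raw column...
CREATE TABLE customers (id serial PRIMARY KEY, email text);
CREATE INDEX ON customers (email);

-- ...is unusable once the column is wrapped in a function, so this
-- query falls back to a sequential scan:
SELECT * FROM customers WHERE lower(email) = 'foo@example.com';

-- An expression index matching the query shape restores index access:
CREATE INDEX ON customers (lower(email));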
When working with PostgreSQL you generally want to get information about slow queries. The usual approach is to set log_min_duration_statement to some low(ish) value, run your app, and then analyze logs. But you can log to many places – flat file, flat file on another disk, local syslog, remote syslog. And – perhaps, instead of […]
Posted by Josh Berkus in pgExperts on 2014-10-11 at 00:19:00
To accompany the New Index Bloat Query, I've written a New Table Bloat Query.  This also involves the launch of the pgx_scripts project on GitHub, which will include most of the "useful scripts" I talk about here, as well as some scripts from my co-workers.

The new table bloat query is different from the check_postgres.pl version in several ways:
  • Rewritten to use WITH statements for better maintainability and clarity
  • Conditional logic for old Postgres versions and 32-bit platforms taken out
  • Index bloat removed, since we have a separate query for that
  • Columns formatted to be more immediately comprehensible
In the course of building this, I found two fundamentally hard issues:
  1. Some attributes (such as JSON and polygon fields) have no stats, so those tables can't be estimated.
  2. There's no good way to estimate bloat for compressed (TOAST) attributes and rows.
Also, while I rewrote the query almost entirely, I am still relying on Greg's core math for estimating table size.  Comparing this with the results of pgstattuple, I'm seeing an error of +/- 20%, which is pretty substantial.  I'm not clear on where that error is coming from, so help improving the math is very welcome!

Results look like this:

 databasename | schemaname |    tablename    | pct_bloat | mb_bloat | table_mb
--------------+------------+-----------------+-----------+----------+----------
 members_2014 | public     | current_member  |        92 |    16.98 |   18.547
 members_2014 | public     | member_response |        87 |    17.46 |   20.000
 members_2014 | public     | archive_member  |        84 |    35.16 |   41.734
 members_2014 | public     | survey          |        57 |    28.59 |   50.188

pct_bloat is how much of the table (0 to 100) is estimated to be dead space.  mb_bloat is how many megabytes of bloat are estimated to exist.  table_mb is the actual size of the table in megabytes.

The suggested criterion is to list tables which are either more than 50% bloat and bigger than 10MB, or more than 25% bloat and bigger than 1GB.  However, you should calibrate

[continue reading]

On 7th of October, Alvaro Herrera committed patch: Implement SKIP LOCKED for row-level locks   This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the […]

SKIP LOCKED is a new feature associated with row-level locking that has been newly-introduced in PostgreSQL 9.5 by this commit:

commit: df630b0dd5ea2de52972d456f5978a012436115e
author: Alvaro Herrera <alvherre@alvh.no-ip.org>
date: Tue, 7 Oct 2014 17:23:34 -0300
Implement SKIP LOCKED for row-level locks

This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro

Let's take for example the simple case of the following table that will be locked:

=# CREATE TABLE locked_table AS SELECT generate_series(1, 4) as id;
SELECT 4

Now a session takes a shared lock on one row of locked_table; taking the lock within a transaction block ensures that it remains held for the duration of the tests.

=# BEGIN;
BEGIN
=# SELECT id FROM locked_table WHERE id = 1 FOR SHARE;
 id
----
  1
(1 row)

Now, the shared lock prevents any update, delete or even exclusive lock from being taken in parallel. Hence the following query will wait until the transaction of the previous session finishes. In this case the query is cancelled by the user (note that the error message tells which row the query was waiting for):

=# SELECT * FROM locked_table WHERE id = 1 FOR UPDATE;
^CCancel request sent
ERROR:  57014: canceling statement due to user request
CONTEXT:  while locking tuple (0,1) in relation "locked_table"
LOCATION:  ProcessInterrupts, postgres.c:2966

There is already one way to bypass this w

[continue reading]
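
To see where this is heading, here is a minimal sketch of the new clause run against the same locked table (continuing the session above):

-- With SKIP LOCKED, the reader silently skips the row locked by the first
-- session instead of blocking or raising an error; rows 2 to 4 are
-- returned and locked by this query.
SELECT * FROM locked_table FOR UPDATE SKIP LOCKED;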

Posted by Mark Wong in 2ndQuadrant on 2014-10-10 at 00:35:01

This is another simple test, in continuation from last time. We will start with the same lineitem table as in the previous example. We will measure the time it takes to load the same 7.2GB text file repeatedly until the table size grows to about 1TB. We create a baseline with a table that has no indexes built on it. Then we repeat with a B-tree index on the l_shipdate DATE column, and again after replacing the B-tree index with a BRIN index.
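
The two index definitions being compared would look something like this (a sketch, with BRIN at its default parameters):

-- The baseline run has no index; the other two runs use one of these.
CREATE INDEX lineitem_l_shipdate_btree ON lineitem USING btree (l_shipdate);
CREATE INDEX lineitem_l_shipdate_brin  ON lineitem USING brin  (l_shipdate);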

[Figure: axle-growth-test - data load time as the table grows, for no index, B-tree, and BRIN]

Our baseline shows that as the table grows the time it takes to insert data also increases. The difference in the time that it takes to insert data when the table is near 1TB compared to when it is empty is about 12 seconds. With the B-tree index in place the difference increases to 84 seconds. Finally the change is only about 15 seconds with the BRIN index in place.

So over 1TB of growth, the overhead on inserting data into the lineitem table due to just the size of the table increases by about 4.3%. B-trees increase that difference to 12.2%, while the BRIN index continues to look encouraging by only increasing the overhead to 4.2%.

The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n°318633 - the AXLE project - http://www.axleproject.eu

Posted by Greg Sabino Mullane in EndPoint on 2014-10-09 at 15:44:00

[Image: Manny Calavera (animated by Lua!) - image by Kitt Walker]

Someone on the #postgresql IRC channel was asking how to make a copy of a schema; presented here are a few solutions and some wrinkles I found along the way. The goal is to create a new schema based on an existing one, in which everything is an exact copy. For all of the examples, 'alpha' is the existing, data-filled schema, and 'beta' is the newly created one. It should be noted that creating a copy of an entire database (with all of its schemas) is very easy: CREATE DATABASE betadb TEMPLATE alphadb;

The first approach for copying a schema is the "clone_schema" plpgsql function written by Emanuel Calvo. Go check it out, it's short. Basically, it gets a list of tables from the information_schema and then runs CREATE TABLE statements of the format CREATE TABLE beta.foo (LIKE alpha.foo INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS). This is a pretty good approach, but it does leave out many types of objects, such as functions, domains, FDWs, etc. as well as having a minor sequence problem. It's also slow to copy the data, as it creates all of the indexes before populating the table via INSERT.

My preferred approach for things like this is to use the venerable pg_dump program, as it is in the PostgreSQL 'core' and its purpose in life is to smartly interrogate the system catalogs to produce DDL commands. Yes, parsing the output of pg_dump can get a little hairy, but that's always preferred to trying to create DDL yourself by parsing system catalogs. My quick solution follows.

pg_dump -n alpha | sed '1,/with_oids/ {s/ alpha/ beta/}' | psql

Sure, it's a bit of a hack in that it expects a specific string ("with_oids") to exist at the top of the dump file, but it is quick to write and fast to run; pg_dump creates the tables, copies the data over, and then adds in indexes, triggers, and constraints. (For an explanation of the sed portion, visit this post). So this solution works very well. Or does it? When playing with this, I found that there is on

[continue reading]