PostgreSQL
The world's most advanced open source database

Upgrading Postgres is not quite as painful as it used to be, thanks primarily to the pg_upgrade program, but there are times when it simply cannot be used. We recently had an existing End Point client come to us requesting help upgrading from their current Postgres database (version 9.2) to the latest version (9.6 - but soon to be 10). They also wanted to finally move away from their SQL_ASCII encoding to UTF-8. As this meant that pg_upgrade could not be used, we took the opportunity to enable checksums as well (a change that cannot be made via pg_upgrade). Finally, they were moving their database server to new hardware. There were many lessons learned and bumps along the way for this migration, but for this post I'd like to focus on one of the most vexing problems: the database encoding.

When a Postgres database is created, it is set to a specific encoding. The most common one (and the default) is "UTF8". This covers 99% of all users' needs. The second most common one is the poorly-named "SQL_ASCII" encoding, which should be named "DANGER_DO_NOT_USE_THIS_ENCODING", because it causes nothing but trouble. The SQL_ASCII encoding basically means no encoding at all, and simply stores any bytes you throw at it. This usually means the database ends up containing a whole mess of different encodings, creating a "byte soup" that will be difficult to sanitize by moving to a real encoding (i.e. UTF-8).
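As a quick aside (not from the original post), you can check what you currently have and create a properly encoded target database like this; the database name app_utf8 is a made-up example:

-- List each database and its encoding:
SELECT datname, pg_encoding_to_char(encoding) AS encoding
  FROM pg_database;

-- Create a UTF-8 target database; template0 is required when the
-- encoding differs from the template database's encoding:
CREATE DATABASE app_utf8
  ENCODING 'UTF8'
  TEMPLATE template0;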

Many tools exist which convert text from one encoding to another. One of the most popular ones on Unix boxes is "iconv". Although this program works great if your source text is using one encoding, it fails when it encounters byte soup.

For this migration, we first did a pg_dump from the old database to a newly created UTF-8 test database, just to see which tables had encoding problems. Quite a few did - but not all of them! - so we wrote a script to import tables in parallel, with some filtering for the problem ones. As mentioned above, iconv was not particularly helpful: looki

[...]

Recently we have received a couple of PostgreSQL support calls, which were related to bad performance on various deployments. In many cases the reason for database slowness was the fact that people assume that PostgreSQL automatically deploys an index on BOTH sides of a foreign key relationship, which is not the case. By the way: […]
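As a quick, hedged illustration (table and column names are made up, not from the Cybertec post): PostgreSQL requires a unique index on the referenced side of a foreign key, but it never creates an index on the referencing side, so that one has to be added by hand:

CREATE TABLE customer (
    customer_id bigserial PRIMARY KEY   -- referenced side: covered by the PK's unique index
);

CREATE TABLE orders (
    order_id    bigserial PRIMARY KEY,
    customer_id bigint NOT NULL REFERENCES customer (customer_id)
);

-- The referencing column gets no index automatically; create it yourself:
CREATE INDEX ON orders (customer_id);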

The post PostgreSQL: Indexes and foreign keys appeared first on Cybertec - The PostgreSQL Database Company.

I've posted a snippet query to the PostgreSQL Wiki that "summarizes the keyspace" of a target B-Tree index. This means that it displays which range of indexed values belong on each page, starting from the root. It requires pageinspect. The query recursively performs a breadth-first search. Along the way, it also displays information about the space utilization of each page, and the number of distinct key values that actually exist on the page, allowing you to get a sense of how densely filled each page is relative to what might be expected.

The query is available from:

https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index
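For context, a rough sketch of the pageinspect functions the wiki query builds on (block 290 is the root block shown in the output below; this is illustrative, not the query itself):

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Index metadata: the root block number and the number of levels
SELECT root, level FROM bt_metap('pgbench_accounts_pkey');

-- Per-page statistics for a single block of the index
SELECT blkno, type, live_items, dead_items, avg_item_size, page_size, free_size
  FROM bt_page_stats('pgbench_accounts_pkey', 290);

-- The individual items (downlinks or index tuples) stored on that block
SELECT itemoffset, ctid, itemlen, data
  FROM bt_page_items('pgbench_accounts_pkey', 290)
 LIMIT 5;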


If I use the query against the largest index that results from initializing a pgbench database at scale factor 10 (pgbench_accounts_pkey), the query takes about 3 seconds to execute on my laptop, and returns the following:

       
level | l_item | blkno | btpo_flags | type | live_items | dead_items | avg_item_size | page_size | free_size | distinct_real_item_keys | highkey | distinct_block_pointers
-------+--------+-------+------------+------+------------+------------+---------------+-----------+-----------+-------------------------+---------+-------------------------
2 | 1 | 290 | 2 | r | 10 | 0 | 15 | 8192 | 7956 | 10 | | 10
1 | 1 | 3 | 0 | i | 285 | 0 | 15 | 8192 | 2456 | 284 | 103945 | 284
1 | 2 | 289 | 0 | i | 285 | 0 | 15 | 8192 | 2456 | 284 | 207889 | 284
1 | 3 | 575 | 0 | i | 285 | 0 | 15 | 8192 | 2456 | 284 | 311833 | 284
1 | 4 | 860 | 0 | i | 285 | 0 | 15 | 8192 | 2456 |
[...]

Barman 2.2 introduces support for parallel copy, by improving performance of both backup and recovery operations in your PostgreSQL disaster recovery solution.

Barman is a piece of software that has been incrementally improved since its conception in 2011. Brick after brick, with just one goal in mind: foster a disaster recovery culture in PostgreSQL, by making the whole backup/recovery process easier and more standard.

Barman is full of interesting features that go beyond disaster recovery (consider the WAL hub facility implemented via barman_wal_restore). Just to name a few: support for rsync/SSH and streaming backup, support for both WAL archiving and streaming (including synchronous for zero data loss clusters), a monitoring facility, incremental backup and recovery, hook scripts, and so on.

However, when managing large databases, Barman suffered from being bound to one CPU for backup and recovery operations. Lately, this was perceived as Barman’s main weak spot by users, and we decided to fill the gap.

Version 2.2 introduces support for parallel backup and recovery when using the rsync copy method, allowing you to specify how many jobs you want to run concurrently.

We have added one global option, called parallel_jobs, that can be overridden at the server level. For backward compatibility, this option is set to 1 by default. This option controls parallelism for both the backup and the recover command.

[vaughan]
description =  "Backup of SRV database"
ssh_command = ssh postgres@vaughan
conninfo = host=vaughan user=barman dbname=postgres
backup_method = rsync
parallel_jobs = 4
; … more options here

In some cases though, users might want to change the default behaviour and decide how many jobs are requested for a backup or recovery operation. For this reason we have implemented the --jobs option (or -j) for both the backup and recover command.

If you want to spread your backup over 8 rsync processes, you can simply execute:

$ barman backup -j 8 vaughan

Likewise, for recovery:

$ barman recover -j 8 [...] vaughan [
[...]
Posted by David Fetter in OpenSCG on 2017-07-17 at 18:16
We started off by creating a logging infrastructure, then arranging for a single table to use it. Rather than repeat that work for each table, let's use a relatively although not completely new feature: `EVENT TRIGGER`. The idea here is that we fire a trigger on `CREATE TABLE` and see to it that the table is logged. We'll write the trigger first, even though in reality, we'd need to load the function it calls first.

```sql
CREATE EVENT TRIGGER add_logger
    ON ddl_command_end
    WHEN tag IN ('create table')
    EXECUTE PROCEDURE add_logger();

COMMENT ON EVENT TRIGGER add_logger IS
    'Ensure that each table which is not a log gets logged';
```

The magic happens inside `add_logger()`, but it's magic we've already seen. First, we'll get the table's name and schema using `pg_event_trigger_ddl_commands()`, filtering out tables which are already log tables. The test here is crude and string-based, but we could easily go to schema-based ones.

```sql
CREATE OR REPLACE FUNCTION add_logger()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    r RECORD;
    cmd TEXT;
BEGIN
    SELECT p.*, c.relname as table_name
    INTO STRICT r
    FROM pg_catalog.pg_event_trigger_ddl_commands() p
    JOIN pg_catalog.pg_class c ON (p.objid = c.oid)
    WHERE p.object_type = 'table'
      AND c.relname !~ '_log$'; /* No recursing! */

    IF NOT FOUND THEN
        RAISE NOTICE 'Skipping log table';
        RETURN;
    END IF;
```

Next, we add the log table for the table in question:

```sql
    RAISE NOTICE 'Adding log table(s) for %.%', r.schema_name, r.table_name;

    cmd := format('CREATE TABLE IF NOT EXISTS %I PARTITION OF the_log FOR VALUES IN (%L) PARTITION BY LIST(table_name);',
        pg_catalog.concat_ws('_', r.schema_name, 'log'),
        r.schema_name
    );
    EXECUTE cmd;

    cmd := format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %s FOR VALUES IN (%L);',
        pg_catalog.concat_ws('_', r.schema_name, r.table_name, 'log'),
        pg_catalog.concat_ws('_', r.schema_name, 'log'),
        r.table_name
    );
    EXECUTE cmd;
```

Finally, the triggers that populate it.

```sql
    cmd := format(
        $q$CREATE TRIGGER %I AFTER INSERT ON %I.%I REFERENCING NEW TABL[...]
```
Posted by Magnus Hagander in Redpill Linpro on 2017-07-17 at 11:02

When you create a table in PostgreSQL, it gets assigned default permissions and a default owner. We can alter the default privileges using the very useful ALTER DEFAULT PRIVILEGES command (a PostgreSQL extension to the standard). However, there isn't much we can do about the owner, which will get set to the role that is currently active. That is, it's the main login role, or another role if the user has run the SET ROLE command before creating the table.

A fairly common scenario that is not well handled here is when a number of end-users are expected to cooperate fully on the tables in a schema, including being able to create and drop them. The built-in role support handles this scenario poorly, due to the ownership handling. Fortunately, this is something where we can once again use an event trigger to make the system do what we need.
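A minimal sketch of the idea (not Magnus's actual code; the "app" schema and the shared app_owners role are assumptions): an event trigger that hands ownership of every newly created table in the schema to a common role, so any member of that role can later alter or drop it.

CREATE OR REPLACE FUNCTION trg_set_owner()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN
        SELECT * FROM pg_event_trigger_ddl_commands()
         WHERE command_tag = 'CREATE TABLE'
           AND schema_name = 'app'              -- assumed schema
    LOOP
        -- object_identity is already schema-qualified and quoted
        EXECUTE format('ALTER TABLE %s OWNER TO app_owners', obj.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER set_owner
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE')
    EXECUTE PROCEDURE trg_set_owner();

Note that the role running the CREATE TABLE needs to be a member of app_owners (or the function needs to be SECURITY DEFINER) for the ALTER TABLE ... OWNER TO to succeed.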

Posted by Craig Kerstiens in CitusData on 2017-07-16 at 15:28

As a developer your CLI is your home. You spend a lifetime of person-years in your shell and even small optimizations can pay major dividends to your efficiency. For anyone that works with Postgres, and likely the psql editor, you should consider investing some love and care into psql. A little known fact is that psql has a number of options you can configure, and these configuration options can all live within a file called .psqlrc in your home directory. Here is my .psqlrc file, which I’ve customized to my liking. Let’s walk through some of the commands within my .psqlrc file:

\set QUIET 1
\pset null '¤'

\set PROMPT1 '%[%033[1m%][%/] # '
-- PROMPT2 is printed when the prompt expects more input, e.g. after you type SELECT * FROM<enter>. %R shows what type of input it expects.
\set PROMPT2 '... > '

\timing

\x auto
\set VERBOSITY verbose
\set HISTFILE ~/.psql_history- :DBNAME
\set HISTCONTROL ignoredups
\set COMP_KEYWORD_CASE upper
\unset QUIET

First you see that we set QUIET 1, which makes psql less noisy when it starts up; we also unset QUIET at the end so we're back to a standard psql shell afterwards.

See all the nulls

Nulls are useful to be aware of within your database. For example, null is not equal to '', and that might have some impact on your app. If you're not experienced working with nulls, then here's a great post on some of the quirks of null. Regardless of how familiar you are with nulls, having them clearly displayed in your output can be helpful. To display nulls clearly, we use \pset. With \pset you can customize many things such as borders, you can tweak the footer, and more. Easily the most handy thing you can do is specify a character for null. You can put any character you like in here, so have fun and make your nulls super noticeable.
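For instance, with the setting above (a quick demo, not from the post), a null stands out immediately:

SELECT NULL AS value;

 value
-------
 ¤
(1 row)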

Know whether you’re connected to prod, staging, or dev

It’s incredibly common to connect to a number of different databases that have the same schema. At a minimum you have prod, staging, dev. No matter how much care you take in running a command against an environment, it’s always nice to have a signpost that tells yo

[...]
Posted by Pavel Stehule on 2017-07-16 at 11:57
I hope that everyone who uses psql uses less as the default pager. See Merlin's important article. less is a great pager, but it is not designed for paging relational data and is missing some features. Most importantly, it cannot freeze some rows and some columns. I didn't find any existing solution good enough for me, although there are some projects on GitHub. A special pager for psql should have some logic to choose the frozen columns (in chars) and rows correctly and automatically.

I wrote a new pager, pspg - the Postgres pager. This pager is designed primarily for use as a psql pager. It is now available for playing and testing. There will probably be some unwanted artifacts, but for usual work it is good enough. It is my first ncurses application, so there will probably be some bugs.

This pager currently supports:
* setting one of three themes from the command line: -s 0 - black/white theme, -s 1 - mc theme, -s 2 - fox theme,
* setting frozen columns from the command line (-c N) or interactively (0..4),
* freezing the first few rows with labels automatically,
* scrolling: left, right, up, down, PgUp, PgDn, ^PgUp, ^PgDn,
* running a demo from the command line: -d,
* different psql border linestyles: ascii, unicode,
* different psql border levels: 0, 1, 2

Usage:
export PAGER="pspg" #default theme 1
psql dbname
\pset border 2 #works best with border 2, but others are supported too
or
export PAGER="pspg -s 0" #black white theme
psql dbname
..

  • black white theme -s 0
  • midnight commander theme -s 1
  • fox theme -s 2


I would welcome cooperation from people with ncurses knowledge. All patches are welcome.

For a long time, the Postgres query planner has sported a huge blinking neon blind-spot that frustrated and enraged DBAs throughout the universe to a level just shy of murderous frenzy. How is this even possible? What terrible lurking horror could elicit such a visceral response from probably the most boring and straight-laced people ever to draw breath? What else? Correlated statistics.

The Adventure Begins!

The Postgres query planner is a cost-estimation engine. Postgres gathers statistics on table contents such as most and least frequent values, value cardinality, rudimentary histograms, and so on. Along with multiple metrics related to hardware responsiveness, it generates multiple viable execution strategies, and chooses the one that probably costs the least resources. So far, this is something nearly all database engines do.

But there’s an area that query planners tend to skip: correlation. After all, if two columns are related in some manner, there’s nearly an infinite number of possible combinations; gathering statistics on all of them would be inadvisable to say the least. Unfortunately, doing so is critical in some cases.
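(A hedged aside, not part of the original walk-through: PostgreSQL 10 adds extended statistics precisely for this, letting ANALYZE record functional dependencies between columns of the sensor_log table built below.)

-- PostgreSQL 10+ only: track the dependency between location and reading
CREATE STATISTICS sensor_log_dep (dependencies)
    ON location, reading
  FROM sensor_log;

ANALYZE sensor_log;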

To see why, let’s build a test case that helps illustrate the problem more directly. As usual, this means trotting out the old, reliable sensor_log table:

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       BIGINT NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, id % 100,
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 1000000) s(id);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

ANALYZE sensor_log;

So now we have 1,000,000 rows spread across 1,000 sensors, tracking up to 100 distinct values. In this particular scenario, the location and reading columns are strongly correlated thanks to our use of the modulo opera

[...]
Posted by Michael Goldberg in Postgres Miktzoanim on 2017-07-13 at 21:15

Introduction

I want to share my experience of using PostgreSQL with RabbitMQ. Recently we faced the question of guaranteed, transactional delivery of messages from the database to a RabbitMQ queue, and it had to be done as quickly as possible. As you probably know, PostgreSQL is remarkably extensible, and in particular it has an extension that can work with RabbitMQ, called pg_amqp. pg_amqp provides stored procedures in PostgreSQL for publishing to AMQP. The extension fits nicely at the application logic level: if the PostgreSQL transaction is rolled back, the message never reaches AMQP; if it commits, the message is delivered.

Install the extension:

git clone https://github.com/omniti-labs/pg_amqp.git
cd pg_amqp
make
make install

Add this line to my postgresql.conf file:

shared_preload_libraries = 'pg_amqp.so'

And create extension:

CREATE EXTENSION amqp;

The extension adds an "amqp" schema to the database. The schema contains one table and four functions:

CREATE TABLE amqp.broker
(broker_id serial NOT NULL,
 host text NOT NULL,
 port integer NOT NULL DEFAULT 5672,
 vhost text,
 username text,
 password text,
 CONSTRAINT broker_pkey PRIMARY KEY (broker_id, host, port));

broker_id – serial number

host – RabbitMQ host

port – RabbitMQ port

vhost – virtual host

username – RabbitMQ user name

password – RabbitMQ user password
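For orientation, a hedged sketch of registering a broker and publishing through it (the host, credentials, exchange and broker_id of 1 are placeholders; amqp.publish itself is described next):

-- Register a broker:
INSERT INTO amqp.broker (host, port, vhost, username, password)
VALUES ('rabbitmq.example.com', 5672, '/', 'rmq_user', 'secret');

-- Publish inside a transaction; the message is only sent if the COMMIT succeeds:
BEGIN;
SELECT amqp.publish(1, 'my.exchange', 'my.routing.key', 'hello from PostgreSQL');
COMMIT;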

amqp.publish(broker_id integer, exchange character varying, routing_key character varying,
 message character varying, delivery_mode integer DEFAULT NULL::integer, content_type character varying DEFAULT NULL::character varying, reply_to character varying DEFAULT NULL::character varying,
 correlation_id character varying DEFAULT NULL::character varying)

amqp.autonomous_publish(broker_id integer, exchange character varying, routing_key character varying,
 message character varying, delivery_mode integer DEFAULT NULL::integer, content_type character varying DEFAULT NULL::character varying, reply_to character varying DEFAULT NULL::
[...]
Last time, we created a hierarchy of tables for logging, but we don't have anything for them to log, and we don't have a way to ensure the logging happens automagically.  This time, we'll fix those problems.

First, a thing to log:

CREATE TABLE foo(
    id BIGSERIAL PRIMARY KEY,
    t TEXT
);

Next, let's make an efficient trigger function.  You may have seen one of these before, but it was probably row-based, so it couldn't be super efficient.  This one is statement-based, and uses the new transition table feature in PostgreSQL 10.  We only need to build one trigger function because our logging infrastructure is schema-agnostic.  We could have separated it into three trigger functions, but that's just more code to maintain.  It starts off as usual:



CREATE OR REPLACE FUNCTION log()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    IF TG_OP NOT IN ('INSERT', 'UPDATE', 'DELETE') THEN
        RAISE EXCEPTION 'This function should not fire on %', TG_OP;
    END IF;

    IF TG_OP = 'INSERT' THEN
        INSERT INTO the_log (
            action, table_schema,    table_name, new_row
        )
        SELECT
            TG_OP,  TG_TABLE_SCHEMA, TG_RELNAME, row_to_json(new_table)::jsonb
        FROM
            new_table;

While the rest of it should be familiar, new_table is new.  It's a transition table, which contains the rows as they looked after the action, which is an INSERT.  We've picked an arbitrary name for it, and we'll see how to do that below.  One operation done, two to go.  This next looks pretty similar:

    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO the_log (
            action, table_schema,    table_name, old_row
        )
        SELECT
            TG_OP,  TG_TABLE_SCHEMA, TG_RELNAME, row_to_json(old_table)::jsonb
        FROM
            old_table;



The other transition table, old_table, comes into play here.  For UPDATE, we'll use both:

    ELSE
        /*
         *  DANGER, WILL ROBINSON!  DANGER!
         *  This implementation assumes based on current implementation details
         *  that
[...]
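A hedged sketch (not the author's code) of how such a function gets attached in PostgreSQL 10, using the REFERENCING clause to name the transition table for the foo table and log() function above:

CREATE TRIGGER log_insert
    AFTER INSERT ON foo
    REFERENCING NEW TABLE AS new_table
    FOR EACH STATEMENT
    EXECUTE PROCEDURE log();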
It's been far too long since I wrote a blog post (again!), so in a bid to start fixing that, here's an easy one:

pgAdmin 4 v1.6 released

The pgAdmin Development Team are pleased to announce the release of pgAdmin 4 version 1.6. This release of pgAdmin 4 includes over 70 bug fixes and a dozen new features. For details, please see the release notes.
Notable changes in this release include:

  • Significant performance improvements on Windows, massively reducing initial load time and improving UI response for the vast majority of users during testing.
  • Enhancements to the Query Tool enabling the viewing of large query resultsets far more quickly. For example, a simple test query with 96K rows rendered results within 1 second vs. 22 seconds in pgAdmin III during testing!
  • A major rewrite of the Query History tab allows browsing of queries executed in the query tool with full details including the entire query, in a much nicer user interface.
  • The Desktop Runtime now supports detachable tabs, allowing the Query Tool and Debugger to be opened in new tabs and then moved to alternate displays (from 1.5 this was possible in web mode only)
  • The Query Tool's Results Grid has been overhauled with a new, sleek look and feel supporting selection of arbitrary rows, columns or blocks of cells with full copy support and column sizing retention.
  • The Dashboard tab can now be closed if desired, to minimise query traffic resulting from graph updates.
For more information, check out the online documentation, the screenshots, and of course the download page.

Collaboration

For those that aren't aware, pgAdmin 4 is the fourth complete rewrite of pgAdmin since the first code was written way back in 1998(!). One of the major aims of the technology change we made for the new version was to help attract new developers as we always found them extremely hard to find when we used C++ (we now use Python and Javascript). To kickstart the project, EnterpriseDB (my employer) allowed me to make use of one of our development teams and we embarked
[...]

I’ve always enjoyed the Percona Live Europe events, because I consider them to be a lot more intimate than the event in Santa Clara. It started in London, had a smashing success last year in Amsterdam (the conference sold out), and by design the travelling conference is now in Dublin from September 25-27, 2017.

So what are you waiting for when it comes to submitting to Percona Live Europe Dublin 2017? The call for presentations closes on July 17, 2017, and the conference has a pretty diverse topic structure (MySQL [and its diverse ecosystem including MariaDB Server naturally], MongoDB and other open source databases including PostgreSQL, time series stores, and more).

And I think we also have a pretty diverse conference committee in terms of expertise. You can also register now. Early bird registration ends August 8 2017.

I look forward to seeing you in Dublin, so we can share a pint of Guinness. Sláinte.

In the example below, we are setting up LDAP with Active Directory (AD), one of the more popular server implementations, using the BigSQL PostgreSQL 9.6 distribution, which is available for Windows and Linux.

Download BigSQL Distribution

Linux:

python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)" 

Windows:

@powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"

Find the bigsql directory and install PostgreSQL. Windows users should not prefix the pgc command with ./ as shown in the following commands:

cd bigsql
./pgc install pg96
./pgc start pg96

Set your environment variables:

cd <directory of installation>/pg96

Linux:

source pg96.env

Windows:

pg96-env.bat

Configure pg_hba.conf

From the PostgreSQL documentation for pg_hba.conf:

Client authentication is controlled by the pg_hba.conf file (HBA stands for host-based authentication).

The general format of the pg_hba.conf file is a set of records, one per line. A record is made up of a number of fields which are separated by spaces and/or tabs. The fields in each record specify a connection type, a client IP address range (if relevant for the connection type), a database name, a user name, and the authentication method to be used for connections matching these parameters.

Modify the LDAP connection record

There are many authentication methods in postgres (md5, trust, peer, etc). In this exercise, you will modify pg_hba.conf to use LDAP as the authentication method.

Navigate to the pg_hba.conf

cd bigsql/data/pg96/

Use your favorite editor to modify the line:

TYPE DATABASE USER CIDR-ADDRESS METHOD
host all all 127.0.0.1/32 trust

When we finish our exercise, the line will look something like this:

TYPE DATABASE USER CIDR-ADDRESS METHOD
host all all 127.0.0.1/32 ldap ldapserver=192.0.2.0 ldapprefix="" ldapsuffix="@oscg.com"
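(One prerequisite worth noting, with an assumed account name: the ldap method only authenticates; the matching role must already exist in PostgreSQL.)

CREATE ROLE jdoe LOGIN;   -- must match the Active Directory account name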

Modifying the METHOD field

The METHOD field provides connection credentials for Active Directory. Th

[...]

Postgres-BDR is an open source project from 2ndQuadrant that provides multi-master features for PostgreSQL. We have pursued a joint strategy of providing both working code available now and also submitting the features into core PostgreSQL.

Postgres-BDR 1.0 runs on a variant distro of PG9.4. This is in Production now and receives regular maintenance and security updates. 2ndQuadrant will support this until 9.4 End of Life in December 2019.

One of the greatest achievements to come out of our work on BDR is the logical replication technology. Our engineers spent a considerable amount of energy to contribute the tech to PostgreSQL core and I feel especially proud that this is a headline feature of the upcoming PG10 release.

And Now BDR 2.0 … 

BDR 2.0 runs on community PG9.6 as an extension. 2ndQuadrant is making this available now to Support customers only. BDR 2.0 allows our customers to upgrade to PostgreSQL 9.6 while we work on some new BDR features to be made available in version 3.0, which will be released as open source.

BDR 3.0 will be available by Q1 2018, maybe earlier. We will be working with our customers to prioritize feature content and it already looks exciting. We expect BDR3 to be the main future development path and will be supported from its release for 5 years.

We will be submitting further enhancements of performance and robustness to logical replication into PG11. These are all important infrastructure features that will pave the path for inclusion of multi-master replication in PostgreSQL core.

Multi-master functionality from BDR 3.0 will be submitted as core patches to PG12 in 2018, allowing us to work hard towards getting full-function multi-master into core for production use from September 2019 onwards. At that point the project will have taken us 7 years to complete integration.

I would like to share the slides for a talk I gave last week at PgDay 2017:

PgDay is an annual conference hosted by DataEgret in Saint Petersburg. The conference is dedicated not only to PostgreSQL, but also to MySQL, NoSQL solutions, commercial DBMSes, etc.

The first ever PGConf US Local: Seattle event is happening in partnership with SEAPUG on August 11th and 12th at the Sheraton Downtown Seattle! On August 11th we have four training options available:
  1. Mastering PostgreSQL Administration by Bruce Momjian
  2. Migrating from Oracle to PostgreSQL by Jim Mlodgenski
  3. Postgres Performance and Maintenance by Joshua (JD) Drake
  4. Database Automation by Robert Bernier 

Tickets are now available!


On August 12th we have 14 break-out sessions between the Development and Operations tracks. A sample of our break-out sessions is available below but please check the full schedule for all of the fantastic content!
We would not be able to produce PGConf Local: Seattle 2017 without the generous support from our sponsors:
    Sponsorship opportunities for the conference are still available. If you use Postgres, support Postgres, or if you are looking to recruit a local Postgres expert, this is a great opportunity. Become a part of our dynamic and growing ecosystem! The prospectus is available at:
    Posted by Bruce Momjian in EnterpriseDB on 2017-07-11 at 12:30

    Hstore was added to Postgres in 8.2 (2006). I didn't appreciate the purpose of hstore when it was first introduced. I knew it was a per-field key/value store, but not much else. People would come up to me at conferences and say "Hstore is the best," but I could only reply with "Yeah." Only later did someone explain to me the usefulness of being able to store unstructured data in a database.

    Of course, with the popularity of NoSQL, everyone now understands the value of storing unstructured data. They probably also know that being able to store structured (relational) and unstructured data in the same database, like Postgres, is a big win.

    Unfortunately, hstore, the Postgres extension that started it, hasn't kept up. When JSONB was added in Postgres 9.4 (2014) it superseded the capabilities of hstore. JSONB is a key/value store, but allows for hierarchical storage too, and uses the popular JSON syntax. Some of the performance improvements made to the JSONB data type could not be backpatched to hstore, due to compatibility requirements.
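    A tiny illustration (not from the post) of the overlap: the same lookup in both types, with jsonb also handling nesting:

    -- hstore: flat key/value pairs (requires the hstore extension)
    SELECT 'color => blue, size => large'::hstore -> 'color';                    -- blue

    -- jsonb: key/value plus arbitrary nesting, using JSON syntax
    SELECT '{"color": "blue", "box": {"h": 10, "w": 4}}'::jsonb #> '{box,h}';    -- 10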

    Continue Reading »

    Posted by Bruce Momjian in EnterpriseDB on 2017-07-10 at 15:30

    In conjunction with PGConf Local: Philly, I will be hosting a Postgres Party at my home near Philadelphia this Friday, July 14. You don't have to attend the conference to attend the party.

    Posted by Quinn Weaver in pgExperts on 2017-07-10 at 00:19

    Quite often I need to kick off a long-running process for a client, then resume work immediately once it's done. pg_restore, pg_basebackup, pg_upgrade, and vacuumdb --analyze-only all come to mind as examples. The ideal thing is to get a text message upon completion. When I'm working on my own boxes, I can just mail(1) my mobile carrier's text gateway, but I can't count on clients' servers having sendmail or the like set up (they usually don't).

    Enter TextBelt. This service is a dead-simple HTTP-to-SMS gateway. Adapted from their docs:


    curl -X POST https://textbelt.com/text \
    --data-urlencode phone="$MY_MOBILE_NUMBER" \
    --data-urlencode message='The process completed.' \
    -d key="$MY_TEXTBELT_API_KEY"

    Cleartext HTTP is also supported, in case the client box has broken SSL libraries. My texts are always nondescript, so I don't mind sending them in the clear.

    The whole thing is open-source, so you can set up your own TextBelt server. Or you can be lazy and throw a few dollars their way for a certain number of texts. I rather like this business model, actually, as a way to support open-source work.

    Posted by Quinn Weaver in pgExperts on 2017-07-07 at 22:47

    Short version

    Postgres Plus keeps psql out of /usr/bin, so you need to set PATH in your cron jobs (including for WAL-E).

    Longer version

    Like all good people, I set up a cron job to run nightly WAL-E base backups. With one client, this failed the first time:

    wal_e.main ERROR MSG: could not run one or more external programs WAL-E depends upon
    DETAIL: Could not run the following programs, are they installed? psql
    STRUCTURED: time=2017-07-07T03:00:01.957961-00 pid=25833
    Turns out they were using Postgres Plus, and it puts psql in /opt/PostgresPlus/9.3AS/bin. That directory is in the enterprisedb user's PATH, of course, but not in the minimal PATH that cron jobs get by default. So I had to log in as enterprisedb, echo $PATH, and then paste PATH = [what echo said] at the top of my cron job. Moral of the story: take care when setting up cron jobs for PostgreSQL forks, or for non-standard community PostgreSQL installations.
    Posted by Mark Wong on 2017-07-07 at 18:26

    When: 6-8pm Thursday July 20, 2017
    Where: iovation
    Who: Josh Berkus
    What: PostgreSQL 10 Beta

     

    PostgreSQL 10 is now out in beta. We called it 10 because this release is going to blow your socks off. Josh Berkus will demonstrate several of the new features in this version, including built-in logical replication, native partitioning, additional query parallelism, JSON full-text search, and more. He’ll also explain some of the issues you’re likely to encounter upgrading to Version X.

    Josh Berkus recently retired from the PostgreSQL Core Team and is currently working for Red Hat as the Project Atomic Community Lead.


    If you have a job posting or event you would like me to announce at the meeting, please send it along. The deadline for inclusion is 5pm the day before the meeting.

    Our meeting will be held at iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

    iovation provides us a light dinner (usually sandwiches or pizza).

    Elevators open at 5:45 and building security closes access to the floor at 6:30.


    Posted by pgCMH - Columbus, OH on 2017-07-07 at 04:00

    I know you were all eagerly awaiting it.. so I’ve set up an Instagram account for the group. You can follow us at https://www.instagram.com/pgcmh. Right now, we’ve got our very first photos up of @PhilVacca delivering his presentation at the June meeting.

    This will be a low-volume account. I can’t see a huge need for tons of photos but I will post pictures of guest speakers and perhaps some PostgreSQL-related photos when I attend some of the larger community events.

    Enjoy.

    The July meeting will be held at 18:00 EST on Tues, the 25th. Once again, we will be holding the meeting in the community space at CoverMyMeds. Please RSVP on MeetUp so we have an idea on the amount of food needed.

    Topic

    Jim Mlodgenski will be flying in from NJ to deliver his ‘Top 10 Mistakes When Migrating From Oracle to PostgreSQL’.

    Jim Mlodgenski is the CEO and co-founder of StormDB and one of the co-organizers of the New York City PostgreSQL User Group. Jim has been active in the PostgreSQL community for many years and speaks on PostgreSQL topics at conferences worldwide, including co-organizing PGDay NYC.

    Prior to StormDB, Jim was Founder of Cirrus Technologies, a professional services company focused on helping move database centric applications to the Cloud. Before that, Jim was Chief Architect at EnterpriseDB. While at EnterpriseDB, in addition to the role of Chief Architect, he also was VP of Technical Services running Sales Engineering, Professional Services, Training and Customer Support.

    Parking

    Please park at a meter on the street or in the parking garage (see below). You can safely ignore any sign saying to not park in the garage as long as it’s after 17:30 when you arrive. Park on the first level in any space that is not marked ‘24 hour reserved’. Once parked, take the elevator to the 3rd floor to reach the Miranova lobby.

    Finding us

    You do not need to check-in at the desk. In fact, they’ll probably tell you they don’t know of a meeting. Just skip the front desk completely.

    The elevator bank is in the back (West side) of the building. Take a left and walk down the hall until you see the elevator bank on your right. Grab an elevator up to the 11th floor. Once you exit the elevator, look to your left and right. One side will have visible cubicles, the other won’t. Head to the side without cubicles. You’re now in the community space. The kitchen is to your right (grab yourself a drink) and the meeting will be held to your left. Walk down the room towards the stage.

    If you have any issues or questions

    [...]
    If you've dealt with PostgreSQL's answer to partitioning in the past, you may have put a lot of work into it.  Happily, you have a lot less to bother with as of PostgreSQL 10, which is not out yet as of this writing.

    Let's dive right in with a table that's a little like an audit trail.

    CREATE TABLE IF NOT EXISTS the_log (
        "timestamp" timestamp with time zone DEFAULT now() NOT NULL,
        "user" text NOT NULL DEFAULT CURRENT_USER,
        action text NOT NULL,
        table_schema text NOT NULL,
        table_name text NOT NULL,
        old_row jsonb,
        new_row jsonb,
        CONSTRAINT the_log_check CHECK (
            (old_row IS NOT NULL)::integer +
            (new_row IS NOT NULL)::integer > 0

        )
    ) PARTITION BY LIST(table_schema);


    COMMENT ON CONSTRAINT the_log_check ON the_log IS
    'Make sure at least one of the columns is NOT NULL';
     


    A few things are going on here which are likely familiar to you if you're a PostgreSQL user:

    • CREATE ... IF NOT EXISTS, which lets you move toward those single-transaction deployments that have made PostgreSQL justly famous
    • Smart defaults on columns
    • JSONB because you'll probably want to probe inside a structure that might change over time
    • A CHECK constraint that ensures at least one not-NULL among the columns it looks at.  This can be generalized in a few different directions.  (This one's not actually all that common, but it really should be.)
    One thing is really new: PARTITION.  In this case, we're partitioning the overarching table by one or more values in the table_schema column.  We're using LIST rather than RANGE because while schemas can definitely be sorted, it's probably not something important that you might use in a query.  LIST partitions are meant to highlight the uniqueness of (a few) individuals.

    A lot of hard work goes on in order to make this work, but now it's been amortized into the PostgreSQL software, where before it was externalized to you and me.




    Since tables with a PARTITION clause can't actually store tuples, let's make a place to store them.

    CREATE TABLE IF NOT E[...]
    Posted by Dimitri Fontaine on 2017-07-06 at 15:30

    Today pgloader v3.4.1 is released and available! This new release comes with 110 commits, as shown in the GitHub compare view.

    This release of pgloader is following the tradition of simplifying things for users, or, if you allow me to quote Alan Kay, I believe that simple things should be simple and complex things should be possible.

    Posted by Bruce Momjian in EnterpriseDB on 2017-07-05 at 14:15

    Now that everyone is using IPv6, it might be time to start playing with it. Postgres has had full IPv6 support for years, so Postgres is a good place to start, particularly with its IPv6-aware data types.

    Since IPv6 addresses are 128-bits instead of IPv4's 32-bits, they can be quite long, e.g. 2001:0db8:85a3:0000:0000:8a2e:0370:7334. As you can see, it is made up of eight quad-hex segments, separated by colons. To shorten the text representation, leading zeros in any quad-hex segment can be removed, though an all-zero quad still requires a zero. In addition, the longest string of all-zero quads can be abbreviated with double colons. This can be illustrated in Postgres:

    SELECT '2001:0db8:85a3:0000:0000:8a2e:0000:7334'::inet;
                 inet
    ------------------------------
     2001:db8:85a3::8a2e:0:7334
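
    A couple of other IPv6-aware operations (not from the original post) show why inet/cidr beat plain text columns:

    SELECT family('2001:db8::1'::inet);                      -- 6 (IPv6)
    SELECT '2001:db8::/32'::cidr >> '2001:db8::1'::inet;     -- true: the network contains the address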
    

    Continue Reading »

    We expect Postgres-XL 10 to be released in the next few months, with the new features of partitioning and logical replication. You’ll be able to load Postgres-XL directly from PostgreSQL.

    For the first time, you’ll be able to run massively parallel queries both across the datanodes and within the datanodes to give huge performance gains.

    Earlier today, we updated the master branch of the Postgres-XL repository to include all commits from PostgreSQL's master branch up to June 26. The XL project is now fully up to date with the PostgreSQL source code, meaning there is only minimal lag between PostgreSQL and Postgres-XL.

    At this point the code is only in Development/Alpha, though we expect to produce Beta1 soon after PostgreSQL Beta 3 in August.

    Support from the Postgres-XL community is critical to achieve these goals in a time bound manner. So we would urge everyone to contribute in whatever way you can to advance Postgres-XL 10. Some of the things that you can contribute are:

    1. Run regression tests, analyse the failures and submit patches to either fix bugs or adjust expected output.
    2. If you're paying attention to PostgreSQL 9.6 and PostgreSQL 10, check if the new features work in Postgres-XL or not. Submit reports, analysis, and patches for things that don't work, or simply confirm things that work so that we know what's not broken. Especially if there is a feature that you're looking forward to, make sure you start testing now.

     

    Stay tuned for further updates on Postgres-XL. Subscribe to our mailing lists and most important, do contribute to the project to make it better.

    Posted by Pavel Stehule on 2017-07-05 at 09:45
    I wrote two small applications, pgimportdoc and pgexportdoc. These applications are now available from the community repository (thanks to Devrim GÜNDÜZ).

    These applications can be used to import/export long documents (text, json, jsonb, xml, bytea) to/from Postgres.

    [pavel@localhost ]$ ./pgimportdoc postgres -f ~/Stažené/enprimeur.xml -c 'insert into xmldata values($1)' -t XML
    [pavel@localhost ]$ cat ~/Stažené/enprimeur.xml | ./pgimportdoc postgres -c 'insert into xmldata values($1)' -t XML
    [pavel@localhost ]$ cat ~/Stažené/enprimeur.xml | ./pgimportdoc postgres -E latin2 -c 'insert into doc values($1) returning id' -t TEXT

    [pavel@localhost ]$ pgexportdoc -c 'select x from xmldata where id = 1' -t XML -f myxmldoc.xml

    This weekend we spent some time moving the PostGIS/pgRouting windows buildbot Winnie to new hardware. Leo did the hardware and I handled installing and reconfiguring stuff. While I was at it, I upgraded to the new Jenkins. Vicky Vergara has been bugging me to set up pgTap so she can run her pgRouting pgTap tests to make sure they work on Windows. She's got 22488 tests. She just loves pgTap. Last time I tried installing pgTap I gave up, but I was in the mood for experimentation so gave it another chance.


    Continue reading "Installing pgTap in windows with msys2 and mingw64"