Posted by Chris Travers on 2014-09-16 at 03:21:00

Preface


I have decided to put together a PGObject Cookbook, showing the power of this framework.  If anyone is interested in porting the database-facing side to other languages, please let me know.  I would be glad to provide whatever help my time and skills allow.

The PGObject framework is a framework for integrating intelligent PostgreSQL databases into Perl applications.  It addresses some of the same problems as ORMs but does so in a very different way.  Some modules are almost ORM-like, and more such modules are likely to be added in the future.  However, unlike an ORM, PGObject mostly serves as an interface to stored procedures, and whatever code generation routines are added are not intended to be quickly changed.  Moreover, it only supports PostgreSQL because we make extensive use of PostgreSQL-only features.

For those who are not interested in Perl, this series may still be interesting, as it covers not only how to use the framework but also various problems that come up when integrating databases with applications.  There are also people who should not use this framework because it is not the right tool for the job.  For example, if you are writing an application that must support many different database systems, you will probably get more out of an ORM than out of this framework.  But you may still get some interesting material from this series, so feel free to enjoy it.

Along the way this will explore a lot of common problems that happen when writing database-centric applications and how these can be solved using the PGObject framework.  Other solutions of course exist and hopefully we can talk about these in the comments.

Much of the content here (outside of the prefaces) will go into a documentation module on CPAN.  However I expect it to also be of far more general interest since the problems are common problems across frameworks.

Introduction


PGObject is written under the theory that the database will be built as a server of information and only loosely tied to the application.  Theref

[continue reading]

Posted by gabrielle roth on 2014-09-16 at 00:49:01
Last weekend we held the biggest PDXPUGDay we’ve had in a while! 5 speakers + a few lightning talks added up to a fun lineup. About 1/3 of the ~50 attendees were in town for FOSS4G; I think the guy from New Zealand will be holding the “visitor farthest from PDXPUG” for a good long […]
Maybe...

I have yet to run PostgreSQL on GCE in production. I am still testing it but I have learned the following:

  1. A standard provisioned disk for GCE will give you ~ 80MB/s random write.
  2. A standard SSD provisioned disk for GCE will give you ~ 240MB/s.

Either disk can be provisioned as a raw device, allowing you to use Linux Software RAID to build a RAID 10, which further increases speed and reliability. Think about that: 4 SSD-provisioned disks in a RAID 10...

The downside I see, outside of the general arguments against cloud services (shared tenancy, all your data in the hands of big brother, lack of control over your resources, general distaste for $vendor, or whatever else we in our right minds can think up), is that GCE is currently limited to 16 virtual CPUs and 104GB of memory.

What does that mean? Well, it means that GCE is likely perfect for 99% of PostgreSQL workloads. By far the majority of PostgreSQL installations need less than 104GB of memory. Granted, we have customers with 256GB, 512GB and even more, but those are few and far between.

It also means that EC2 is no longer your only choice for dynamically provisioned cloud VMs for PostgreSQL. Give it a shot; the more competition in this space, the better.

Posted by US PostgreSQL Association on 2014-09-15 at 15:44:05

It has been a little quiet on the U.S. front of late. Alas, summer of 2014 has come and gone and it is time to strap on the gators and get a little muddy. Although we have been relatively quiet, we have been doing some work. In 2013 the board appointed two new board members, Jonathan S. Katz and Jim Mlodgenski. We also affiliated with multiple PostgreSQL User Groups:

  • NYCPUG
  • PhillyPUG
  • SeaPUG
  • PDXPUG

read more

Thanks to the Bucardo team for responding to my previous post. My cascaded slave replication works as expected.

Today I noticed there is still something to do related to the delta and track tables.
Single table replication scenario:
Db-A/Tbl-T1 (master) => Db-B/Tbl-T2 (slave) => Db-C/Tbl-T3 (cascaded slave)

Every change on table T1 is replicated to T2, then from T2 to T3. After a while, VAC successfully cleans the delta and track tables on Db-A, but not on Db-B.

I see two issues:
1. If the cascaded replication from T2 to T3 succeeds, the delta table on Db-B is not cleaned up by VAC.
2. If the cascaded replication from T2 to T3 fails before the VAC schedule runs, the delta table on Db-B is cleaned up by VAC anyway. The changes pending for the cascaded replication from T2 to T3 are then lost.

I fixed it by modifying the SQL inside bucardo.bucardo_purge_delta(text, text):

-- Delete all txntimes from the delta table that:
-- 1) Have been used by all dbgroups listed in bucardo_delta_targets
-- 2) Have a matching txntime from the track table
-- 3) Are older than the first argument interval
myst = 'DELETE FROM bucardo.'
|| deltatable
|| ' USING (SELECT track.txntime AS tt FROM bucardo.'
|| tracktable
|| ' track INNER JOIN bucardo.bucardo_delta_targets bdt ON track.target=bdt.target'
|| ' GROUP BY 1 HAVING COUNT(*) = '
|| drows
|| ') AS foo'
|| ' WHERE txntime = tt'
|| ' AND txntime < now() - interval '
|| quote_literal($1);

I need advice from the Bucardo team.


Posted by Chris Travers on 2014-09-14 at 23:36:00
15 September 2014, London. The LedgerSMB project - all-volunteer developers and contributors - today announced LedgerSMB 1.4.0.

Based on an open source code base first released in 1999, the LedgerSMB project was formed in 2006 and saw its 1.0 release in the same year. It has now seen continuous development for over eight years, and that shows no signs of slowing down.

"LedgerSMB 1.4 brings major improvements that many businesses need," said Chris Travers, who helped found the project. "Businesses which do manufacturing or retail, or need features like funds accounting will certainly get much more out of this new release."

Better Productivity


LedgerSMB 1.4 features a redesigned contact management framework that allows businesses to better keep track of customers, vendors, employees, sales leads, and more.  Contacts can be stored and categorized, and leads can be converted into sales accounts.

Additionally, a new import module has been included that allows businesses to upload CSV text files to import financial transactions and much more. No longer does data entry need to be done entirely by hand or involve customizing the software.

Many smaller enhancements are here as well. For example, shipping labels can now be printed for invoices and orders, and user management workflows have been improved.

Better Reporting


The reporting interfaces have been rewritten in LedgerSMB 1.4.0 in order to provide greater flexibility in both reporting and in sharing reports. Almost all reports now include a variety of formatting options including PDF and CSV formats. Reports can also be easily shared within an organization using stable hyperlinks to reports. Additionally the inclusion of a reporting engine means that it is now relatively simple to write third-party reports which offer all these features. Such reports can easily integrate with LedgerSMB or be accessed via a third party web page.

Additionally, the new reporting units system provides a great deal more flexibility in tracking money and resources as t

[continue reading]

On 9th of September, Tom Lane committed patch: Add width_bucket(anyelement, anyarray).   This provides a convenient method of classifying input values into buckets that are not necessarily equal-width. It works on any sortable data type.   The choice of function name is a bit debatable, perhaps, but showing that there's a relationship to the SQL […]
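A minimal sketch of how the new variant behaves, using made-up age thresholds (the values and buckets here are invented for illustration):

SELECT age,
       width_bucket(age, ARRAY[18, 25, 35, 50, 65]) AS bucket
  FROM (VALUES (16), (22), (30), (47), (70)) AS t(age);
-- 16 -> 0 (below the first bound), 22 -> 1, 30 -> 2, 47 -> 3, 70 -> 5 (at or past the last bound)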

Postgres 9.5 will come with an additional logging option making it possible to log replication commands that are being received by a node. It has been introduced by this commit.

commit: 4ad2a548050fdde07fed93e6c60a4d0a7eba0622
author: Fujii Masao <fujii@postgresql.org>
date: Sat, 13 Sep 2014 02:55:45 +0900
Add GUC to enable logging of replication commands.

Previously replication commands like IDENTIFY_COMMAND were not logged
even when log_statements is set to all. Some users who want to audit
all types of statements were not satisfied with this situation. To
address the problem, this commit adds new GUC log_replication_commands.
If it's enabled, all replication commands are logged in the server log.

There are many ways to allow us to enable that logging. For example,
we can extend log_statement so that replication commands are logged
when it's set to all. But per discussion in the community, we reached
the consensus to add separate GUC for that.

Reviewed by Ian Barwick, Robert Haas and Heikki Linnakangas.

The new parameter is called log_replication_commands and needs to be set in postgresql.conf. The default is off, so as not to surprise existing users with new log information after an upgrade to 9.5 or newer versions. Note that replication commands received by a node were already logged at DEBUG1 level by the server. A last thing to note is that if log_replication_commands is enabled, all the commands are printed as LOG and not as DEBUG1, which is kept for backward-compatibility purposes.
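As a rough sketch, assuming a superuser session on 9.5, the parameter can also be flipped without editing postgresql.conf by hand; it is reloadable, so no restart should be needed:

ALTER SYSTEM SET log_replication_commands = on;
SELECT pg_reload_conf();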

Now, a server enabling this logging mode...

$ psql -At -c 'show log_replication_commands'
on

... is able to show replication commands in LOG mode. Here is, for example, the set of commands sent by a standby starting up:

LOG:  received replication command: IDENTIFY_SYSTEM
LOG:  received replication command: START_REPLICATION 0/3000000 TIMELINE 1

This will certainly help utilities and users running audits of replication, so I am looking forward to seeing log parsing tools like pgBadger produce some nice output using this

[continue reading]

Posted by Barry Jones on 2014-09-13 at 04:12:15
Here's the video from the August UpstatePHP meeting in Greenville discussing SQL vs NoSQL and where they are useful for your development process. I represented SQL solutions (*cough* PostgreSQL *cough*) while Benjamin Young represented NoSQL. Ben has actively contributed to CouchDB, worked for Cloudant, Couchbase, organizes the REST Fest Unconference (happening again September 25-27th) and is t...

A while ago I wrote about compiling PostgreSQL extensions under Visual Studio – without having to recompile the whole PostgreSQL source tree.

I just finished the pg_sysdatetime extension, which is mainly for Windows but also supports compilation with PGXS on *nix. It’s small enough that it serves as a useful example of how to support Windows compilation in your extension, so it’s something I think is worth sharing with the community.

The actual Visual Studio project creation process took about twenty minutes, and would’ve taken less if I wasn’t working remotely over Remote Desktop on an AWS EC2 instance. Most of the time was taken by the simple but fiddly and annoying process of adding the include paths and library path for the x86 and x64 configurations. That’s necessary because MSVC can’t just get them from pg_config and doesn’t seem to have user-defined project variables to let you specify a $(PGINSTALLDIR) in one place.

Working on Windows isn’t always fun – but it’s not as hard as it’s often made out to be either. If you maintain an extension but haven’t added Windows support it might be easier than you expect to do so.

Packaging it for x86 and x64 versions of each major PostgreSQL release, on the other hand… well, let’s just say we could still use PGXS support for Windows with a “make installer” target.

Posted by gabrielle roth on 2014-09-11 at 20:55:43

We had about 50 folks attend PDXPUGDay 2014 last week, between DjangoCon and FOSS4G. A lot of folks were already in town for one of the other confs, but several folks also day-tripped from SeaPUG! Thanks for coming on down.

Thanks again to our speakers:
Josh Drake
David Wheeler
Eric Hanson
Veronika Megler
Kristin Tufte
Josh Berkus

(Plus our lightning talk speakers: Josh B, Mark W, and Basil!)

And our sponsors:
2nd Quadrant
iovation
PGX

And of course, PSU for hosting us.

Videos are linked from the wiki.


Posted by Leo Hsu and Regina Obe on 2014-09-11 at 20:37:00

If you weren't able to make it to FOSS4G 2014 this year, you can still experience the event live. All the tracks are being televised live and the reception is pretty good. https://2014.foss4g.org/live/. Lots of GIS users are using PostGIS and PostgreSQL. People seem to love Node.JS too.

After hearing enough about Node.JS from all these people, and this guy (Bill Dollins), I decided to try this out for myself.

I created a node.js web application - which you can download from here: https://github.com/robe2/node_postgis_express . It's really a spin-off from my other viewers, but more raw. I borrowed the same ideas as Bill, but instead of having a native node Postgres driver, I went for the pure JavaScript one so it's easier to install on all platforms. I also experimented with using base-64 encoding to embed raster output directly into the browser so I don't have to have that silly img src path reference thing to contend with.

A co-worker of mine did a blog post last year that I’ve found incredibly useful when assisting clients with getting shared_buffers tuned accurately.

Setting shared_buffers the hard way

You can follow his queries there for using pg_buffercache to find out how your shared_buffers are actually being used. But I had an incident recently that I thought would be interesting to share, one that shows how shared_buffers may not need to be set nearly as high as you believe it should be. Or it can equally show you that you definitely need to increase it. Object names have been sanitized to protect the innocent.

To set the stage, the database total size is roughly 260GB and the use case is high data ingestion with some reporting done on just the most recent data at the time. shared_buffers is set to 8GB. The other thing to note is that this is the only database in the cluster. pg_buffercache is installed on a per database basis, so you’ll have to install it on each database in the cluster and do some additional totalling to figure out your optimal setting in the end.
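Before the per-relation query below, a quick hedged sketch of a cluster-wide total (assuming the default 8kB block size) shows how shared_buffers is split across databases; pg_buffercache lists every buffer, so this part works from whichever database it is installed in:

SELECT coalesce(d.datname, '<unused buffers>') AS database
  , pg_size_pretty(count(*) * 8192) AS buffered
 FROM pg_buffercache b
 LEFT JOIN pg_database d ON d.oid = b.reldatabase
 GROUP BY d.datname
 ORDER BY count(*) DESC;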

database=# SELECT c.relname
  , pg_size_pretty(count(*) * 8192) as buffered
  , round(100.0 * count(*) / ( SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer,1) AS buffers_percent
  , round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation
 FROM pg_class c
 INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
 INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
 GROUP BY c.oid, c.relname
 ORDER BY 3 DESC
 LIMIT 10;
               relname               | buffered | buffers_percent | percent_of_relation
-------------------------------------+----------+-----------------+---------------------
 table1                              | 7479 MB  |            91.3 |                 9.3
 table2                              | 362 MB   |             4.4 |               100.0
 table3                              | 311 MB   |             3.8 |                 0.8
 table4                   

[continue reading]

Posted by Chris Travers on 2014-09-11 at 13:25:00
This will be the final installment on Math and SQL and will cover the problem with NULLs.  NULL handling is probably the most poorly thought-out feature of SQL and is generally inconsistent with the relational model.  Worse, a clear mathematical approach to NULLs is impossible with SQL because too many different meanings are attached to the same value.

Unfortunately, nulls are also indispensable because wider tables are more expressive than narrower tables.  This makes advice such as "don't allow nulls in your database" somewhat dangerous because one ends up having to add them back in fairly frequently.

At the same time understanding the problems that NULLs introduce is key to avoiding the worst of the problems and managing the rest.

Definition of a Null Set


A null set is simply a set with no members.  This brings us to the most obvious case of the use of a NULL, used when an outer join results in a row not being found.  This sort of use by itself doesn't do too much harm but the inherent semantic ambiguity of "what does that mean?" also means you can't just substitute join tables for nullable columns and solve the problems that NULLs bring into the database. This will hopefully become more clear below.

Null as Unknown


The first major problem surfaces when we ask the question, "when I do a left join and the row to the right is not found, does that mean we don't know the answer yet or that there is no value associated?"  In all cases, a missing result from an outer join will sometimes mean that the answer is not yet known, if only because we are still inserting the data in stages.  But it can also mean that maybe there is an answer and that there is no value associated.  In almost all databases, this may also be the case in this situation.

But then there is no additional harm done in allowing NULLs to represent unknowns in the tables themselves, right?
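A few one-liners make the surprise concrete (a minimal sketch; the values are invented):

SELECT NULL = NULL;                                -- NULL, not true
SELECT count(*) FROM (VALUES (1), (NULL)) t(x)
 WHERE x <> 2;                                     -- 1: the NULL row silently drops out
SELECT 1 IN (2, NULL);                             -- NULL, not false: the classic NOT IN trap

In each case the unknown quietly propagates instead of producing the true or false answer most application code expects.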

Handling NULLs as unknown values complicates database design and introduces enough problems that many experts, like Chris Date, tend to be against their use generally.

[continue reading]

After my Btree bloat estimation query, I found some time to work on a new query for tables. The goal here is still to have a better bloat estimation using dedicated queries for each kind of object.

Compared to the well-known bloat query, this query pays attention to:

  • TOAST
  • headers of variable length types
  • easier to filter or parse

You’ll find the queries here:

Tests

I created the file sql/bloat_tables.sql with the 9.0-and-later version of the query. I edited the query to add the bloat reported by pgstattuple (free_percent + dead_tuple_percent) to compare both results, and added the following filter:

-- remove Non Applicable tables
NOT is_na
-- remove tables with real bloat < 1 block
AND tblpages*((pst).free_percent + (pst).dead_tuple_percent)::float4/100 >= 1
-- filter on table name using the parameter :tblname
AND tblname LIKE :'tblname'

Here is the result on a fresh pagila database:

postgres@pagila=# \set tblname %
postgres@pagila=# \i sql/bloat_tables.sql 
 current_database | schemaname |    tblname     | real_size | bloat_size | tblpages | is_na |   bloat_ratio    | real_frag 
------------------+------------+----------------+-----------+------------+----------+-------+------------------+-----------
 pagila           | pg_catalog | pg_description |    253952 |       8192 |       31 | f     |  3.2258064516129 |      3.34
 pagila           | public     | city           |     40960 |       8192 |        5 | f     |               20 |     20.01
 pagila           | public     | customer       |     73728 |       8192 |        9 | f     | 11.1111111111111 |     11.47
 pagila           | public     | film           |    450560 |       8192 |       55 | f     | 1.81818181818182 |      3.26
 pagila           | public     | rental         |   1228800 |     131072 |      150 | f     

[continue reading]

Writing a complex database server like PostgreSQL is not an easy task. Memory management in particular is an important task which needs special attention. Internally PostgreSQL makes use of so-called “memory contexts”. The idea of a memory context is to organize memory in groups, which are organized hierarchically. The main advantage is that in case […]
Posted by Paul Ramsey on 2014-09-10 at 00:00:00

The 2.1.4 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.4 release. As befits a patch release, the focus is on bugs, breakages, and performance issues.

http://download.osgeo.org/postgis/source/postgis-2.1.4.tar.gz

Continue Reading by clicking title hyperlink ..

As mentioned in my earlier blog, I'm visiting several events in the US and Canada in October and November. The first of these, the talk about WebRTC in CRM at xTupleCon, has moved from the previously advertised timeslot to Wednesday, 15 October at 14:15.

WebRTC meeting, Norfolk, VA

Later that day, there will be a WebRTC/JavaScript meetup in Norfolk hosted at the offices of xTuple. It is not part of xTupleCon and is free to attend. Please register using the Eventbrite page created by xTuple.

This will be a hands on event for developers and other IT professionals, especially those in web development, network administration and IP telephony. Please bring laptops and mobile devices with the latest versions of both Firefox and Chrome to experience WebRTC.

Free software developers at xTupleCon

If you do want to attend xTupleCon itself, please contact xTuple directly through this form for details about the promotional tickets for free software developers.

Posted by Andreas Scherbaum on 2014-09-09 at 18:38:35

Google Summer of Code 2014 is wrapped up: Maxence Ahlouche did an excellent job implementing one new algorithm for MADlib and refactored the code base for another one.

I posted a more detailed explanation on the Pivotal blog.

Changelog

Here is some fresh news about my previous work (see part 1 and part 2) on a better Btree bloat query:

Known issue

While working on table bloat (I will blog about that very soon), I found a large deviation in the statistics on array types. I’m not sure how to handle these animals’ headers correctly yet.

Cheers and happy monitoring!

Posted by Joshua Drake in CommandPrompt on 2014-09-08 at 18:54:19
I had the honor of being asked to give the introduction at PDXPGDay 2014 this past Saturday. I didn't speak very long but it was great to see a lot of the old stomping ground. It had been quite some time since I had been in the group of Wheeler, Roth, Wong, Berkus and a few others.

The conference was really a mini-conference but it was great. It was held in the exact same room that PostgreSQL Conference West was held all the way back in 2007. It is hard to believe that was so long ago. I will say it was absolutely awesome that PDX still has the exact same vibe and presentation! (Read: I got to wear shorts and a t-shirt).

Some items of note: somebody was perverse enough to write a FUSE driver for PostgreSQL, and it is even bi-directional. This means that PostgreSQL gets mounted as a filesystem and you can even use Joe (or yes, VIM) to edit values and it saves them back to the table.

Not nearly enough of the audience was aware of PGXN. This was a shock to me and illustrates a need for better documentation and visibility through .Org.

The success of this PgDay continues to illustrate that other PUGs should be looking at doing the same, perhaps annually!

Thanks again Gab and Mark for entrusting me with introducing your conference!

Posted by Andrew Dunstan in pgExperts on 2014-09-08 at 16:48:00
The other day I complained on the PostgreSQL hackers list about a couple of aspects of Javascript that make it quite bothersome for large scale programming, namely the old style variable scoping rules and the very limited string literals, which can't stretch across lines (don't mention the awful backslash trick, please) and don't have any provision for interpolation. If you're used, as I am, to Perl, which has had lexically scoped variables for about 20 years and awesome string literal support for a good deal longer than that, these things are really quite painful.

The good news is that the forthcoming standard, ECMAScript 6, also known as "harmony", contains features to deal with both of these issues.

The latest versions of the V8 engine actually support harmony scoping rules, with one restriction, namely that it's only supported in contexts that are in "strict" mode. I believe that this restriction will go away in due course.

Petr Jelinek dropped me a note the other day to tell me how to set V8 flags, and based on that I have developed a patch for PLV8 that allows for harmony scoping. It requires a new GUC setting that is applied during the module's initialization code.

This is available in my clone of the plv8 code, and you can see what it's doing at https://code.google.com/r/amdunstan-harmony-scoping/source/detail?r=8acdcdabcd0c2b9ad99f66a5258920db805efdc3#

I'll do a bit more testing and then talk to the other PLV8 developers about merging it in.

Things are less rosy on the string handling front, I'm afraid. I have no idea when V8 will get the "template strings" feature that will address the string literal deficiencies. As far as I can tell nobody is working on it.
Posted by gabrielle roth on 2014-09-06 at 02:00:00

When: 7-9pm Thu Sep 18, 2014
Where: Iovation
Who: Jay Riddle
What: Using Postgresql to enable Google like Search

Jay’s been experimenting with Pg’s full text search capabilities. At our next meeting, he’ll cover the following:
* Brief intro on why Google like search capabilities are fun.
* Introduce Postgresql full text abilities. Discuss at a high level why a full text index may be a bit heavier than a normal index.
* Look at possible solutions for use cases where the data you want to index is spread across multiple columns and multiple tables.
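For that last point, a hedged sketch of the usual expression-index approach (the docs table and its title/body columns are hypothetical):

CREATE INDEX docs_fts_idx ON docs
 USING gin (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, '')));

SELECT * FROM docs
 WHERE to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
       @@ to_tsquery('english', 'postgres & search');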

Jay Riddle is a database administrator at Viewpoint software.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

Elevators open at 6:45 and building security closes access to the floor at 7:30.


See you there!


Posted by robert berry on 2014-09-06 at 00:00:00

Btrfs considered … helpful

Sep 6, 2014 – Portland, OR

This post takes a look at Btrfs and its transparent compression mount option for storing data files in the context of a PostgreSQL TPC-B benchmark.

The results suggest a limited but real performance benefit to running PostgreSQL with data tables on Btrfs rather than Ext4. A performance gain is realized when table and index sizes exceed the memory available for disk cache and the system is under enough load that checkpoints constrain the disk reads needed for continued transaction processing. Benchmarks experienced an approximate 2X gain in transactions per second for the system under test.

A hypothesis and evidence are presented which may explain the performance differences. One possibility is that Btrfs’ fsync implementation is more selective than Ext4 mounted with data=ordered. Another possibility is that write entanglement in Ext4 exacerbates resource contention during checkpoints, starving required disk reads. And it appears that transparent compression reduces the overall read and write load, reducing resource contention during checkpoints. Alternative theories are actively solicited.


Of course computational systems are dynamic systems and meaningful benchmark results are hard. Please accept these results as meaningful only within the specific context presented. There could easily be a bifurcation point for any old configuration parameter — life will find a way to ruin your benchmark.

System Under Test

The tests were run on my desktop computer. The pgbench databases existed in a tablespace on a cheap Western Digital 3TB Network drive from Fry’s. The underlying filesystem for the tablespaces changed between tests while all other configuration options remain the same. WAL and other disk usage was on a generic Intel SSD. The computer has 32 GB of RAM, and shared buffers was set at 500MB.

- Btrfs v3.12
- 3.13.0-24-generic

For more details, below please find links pointing to hardware and configuration details.

postgresql.conf

pgbench-tools con

[continue reading]

Posted by Craig Ringer in 2ndQuadrant on 2014-09-05 at 15:12:45

RPMs for BDR (Bi-Directional Replication for PostgreSQL) are now available for testing. They contain BDR release 0.7.1, which is based on PostgreSQL 9.4beta2.

Because there may be an on-disk format change (catversion bump) before PostgreSQL 9.4.0 final, and because these RPMs are quite new, this should be considered a test release.

RPMs are available for CentOS 5, 6 & 7 and Fedora 19 & 20. The SRPMs are also published and available for inspection.

There are no Debian packages available yet.

Feedback would be greatly appreciated. Please post to pgsql-general with “BDR” in the subject and appropriate details about your distro/version, or write to bdr@2ndquadrant.com .

All packages are signed by key 0x6E192B0E which is in turn signed by the official 2ndQuadrant packaging key.

I’d like to credit the PGDG RPM team in general, and Devrim GÜNDÜZ in particular, for the original RPM packaging work that these RPMs are based on. I’ve made a lot of changes, only a few of which are actually specific to BDR, but it would’ve all taken a lot longer without being able to start with the PGDG packages. I’m progressively contributing my non-BDR-specific improvements back to PGDG to make RPM packaging easier down the track.

On 2nd of September, Andres Freund committed patch: Add psql PROMPT variable showing which line of a statement is being edited.   The new %l substitution shows the line number inside a (potentially multi-line) statement starting from one.   Author: Sawada Masahiko, heavily editorialized by me. Reviewed-By: Jeevan Chalke, Alvaro Herrera This should help some […]
On 2nd of September, Fujii Masao committed patch: Support ALTER SYSTEM RESET command.   This patch allows us to execute ALTER SYSTEM RESET command to remove the configuration entry from postgresql.auto.conf.   Vik Fearing, reviewed by Amit Kapila and me. Only this year we got “ALTER SYSTEM" command, that lets you change config values for […]
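A hedged sketch of what this looks like in practice (work_mem is just an example value):

ALTER SYSTEM SET work_mem = '64MB';   -- writes an entry to postgresql.auto.conf
ALTER SYSTEM RESET work_mem;          -- removes that single entry again
ALTER SYSTEM RESET ALL;               -- clears every entry from postgresql.auto.conf
SELECT pg_reload_conf();              -- reload so the change is visible to running backends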
Posted by Michael Paquier in VMware on 2014-09-04 at 03:07:33

Customizing the shape of logs written by a PostgreSQL server is possible using a hook present in core code, invoked before anything is sent to the server logs.

This hook is present in elog.c and is defined as follows:

emit_log_hook_type emit_log_hook = NULL;
[...]
if (edata->output_to_server && emit_log_hook)
    (*emit_log_hook) (edata);

Note that it uses ErrorData as its argument, a structure defined in elog.h containing a lot of status information about the log entry generated.

In order to activate the hook, it is necessary to register a custom routine defined in a library loaded by the core server at postmaster startup via shared_preload_libraries. The module presented in this post, named jsonlog and able to reshape log messages as JSON objects (one object per log entry), uses the following basic infrastructure to use the logging hook correctly:

PG_MODULE_MAGIC;

void _PG_init(void);
void _PG_fini(void);

/* Hold previous logging hook */
static emit_log_hook_type prev_log_hook = NULL;

void
_PG_init(void)
{
    prev_log_hook = emit_log_hook;
    emit_log_hook = write_jsonlog;
} 

void
_PG_fini(void)
{
    emit_log_hook = prev_log_hook;
}

Then it is a matter of defining the custom routine write_jsonlog, able to rewrite the log entry sent to the server using, for example, a StringInfo structure to which new JSON fields are appended. This code has nothing really complicated, and readers are invited to have a look here for all the details. Basically what it does is initialize and finalize the JSON string correctly, filling it with JSON fields depending on the error status fields available when the custom routine is called. Once the object is built it is written out. Note that the extension blocks other logging on the server by updating output_to_server to false in ErrorData, and that the strings generated as field values are made legal JSON by using the in-core function escape_json. The same fields as the CSV output are covered in the string generation as well. Then, expect an increase of log volume as the field names need to be set

[continue reading]

The site apt.postgresql.org is a great resource for those who live in the Debian-derived world. It keeps up to date with the latest PostgreSQL packages and has a whole team dedicated to creating these packages. Of course, this is the Open Source world, so not everyone agrees 100% with the way things are done in this project. As I noted here, there are some issues.

These issues are not meant to detract from otherwise excellent work, but rather are a note to those who use the repository to look out for further problems. I also have a video displaying specifically what the issues are, here.

On 29th of August, Tom Lane committed patch: Add min and max aggregates for inet/cidr data types.   Haribabu Kommi, reviewed by Muhammad Asif Naeem So, the change is pretty trivial to explain – we now have min() and max() aggregates (as in: functions that you call in “GROUP BY" queries) that work on inet/cidr […]
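A minimal sketch of the new aggregates (the connections table and its rows are invented for illustration; note the ordering is by address value, not by text):

CREATE TEMP TABLE connections (client inet);
INSERT INTO connections VALUES ('10.0.0.5'), ('192.168.1.20'), ('10.0.0.200');
SELECT min(client), max(client) FROM connections;
-- min = 10.0.0.5, max = 192.168.1.20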