The world's most advanced open source database
Top posters
Number of posts in the past two months
Top teams
Number of posts in the past two months
  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.
  • Get in touch with the Planet PostgreSQL administrators at planet at
Posted by David Z in Highgo Software on 2021-07-23 at 21:59

1. Overview

I was working on the PostgreSQL storage related features recently, and I found PostgreSQL has designed an amazing storage addressing mechanism, i.e. Buffer Tag. In this blog, I want to share with you my understanding about the Buffer Tag and some potential usage of it.

2. Buffer Tag

I was always curious to know how PostgreSQl can find out the tuple data blocks so quickly when the first time I started to use PostgreSQL in an IoT project, but I never got the chance to look it into details even though I knew PostgreSQL is an very well organized open-source project. Until I recently got a task which needs to solve some storage related issue in PostgreSQL. There is very detailed explanation about how Buffer Manger works in an online PostgreSQL books for developers The Internals of PostgreSQL, one of the best PostgreSQL books I would recommend to the beginner of PostgreSQL development to read.

Buffer Tag, in simple words, is just five numbers. Why it is five numbers? First, all the objects including the storage files are managed by Object Identifiers, i.e. OID. For example, when user creates a table, the table name is mapped to an OID; when user creates a database, the name is mapped to an OID; when the corresponding data need to be persistent on disk, the files is also named using OID. Secondly, when a table requires more pages to store more tuples, then each page for the same table is managed by the page number in sequence. For example, when PostgreSQL needs to estimate the table size before decide what kind of scan should be used to find out the tuple faster, it need to know the number of blocks information. Thirdly, it is easy to understand that data tuples are the major user data need to be stored, but in order to better manage these data tuples, PostgreSQL needs others information, such as visibility to manage the status of these data tuples, and free space to optimize files usage. So this ends up with five numbers, i.e. Tablespace, Database, Table, ForkNumber, BlockNumber.



One theme of the 3.2 release is new analytical functionality in the raster module, and access to cloud-based rasters via the "out-db" option for rasters. Let's explore two new functions and exercise cloud raster support at the same time.

Having explored one fork in the path (Elasticsearch and Kibana) in the previous pipeline blog series (here is part 5), in this blog we backtrack to the junction to explore the alternative path (PostgreSQL and Apache Superset). But we’ve lost our map (aka the JSON Schema)—so let’s hope we don’t get lost or attacked by mutant radioactive monsters.

Fork in an abandoned uranium mine
(Source: Shutterstock)

Just to recap how we got here, here’s the Kafka source connector -> Kafka -> Kafka sink connector -> Elasticsearch -> Kibana technology pipeline we built in the previous blog series:

And here’s the blueprint for the new architecture, with PostgreSQL replacing Elasticsearch as the target sink data store, and Apache Superset replacing Kibana for analysis and visualization:

Well that’s the plan, but you never know what surprises may be lurking in the unmapped regions of a disused uranium mine!

1. Step 1: PostgreSQL database

In the news recently was Instaclustr’s acquisition of Credativ, experts in the open source PostgreSQL database (and other technologies). As a result, Instaclustr has managed PostgreSQL on its roadmap, and I was lucky to get access to the internal preview (for staff only) a few weeks ago. This has all the features you would expect from our managed services, including easy configuration and provisioning (via the console and REST API), server connection information and examples, and built-in monitoring with key metrics available (via the console and REST API). 

Having a fully functional PostgreSQL database available in only a few minutes is great, but what can you do with it?

The first thing is to decide how to connect to it for testing. There are a few options for client applications including psql (a terminal-based front-end), and a GUI such as pgAdmin4 (which is what I used). Once you have pdAdmin4 running on your desktop you can easily create a new connection to the Instaclustr managed PostgreSQL server as d


My colleague Kat Batuigas recently wrote about using the powerful open-source QGIS desktop GIS to import data into PostGIS from an ArcGIS Feature Service. This is a great first step toward moving your geospatial stack onto the performant, open source platform provided by PostGIS. And there's no need to stop there! Crunchy Data has developed a suite of spatial web services that work natively with PostGIS to expose your data to the web, using industry-standard protocols. These include:

zheap has been designed as a new storage engine to handle UPDATE in PostgreSQL more efficiently. A lot has happened since my last report on this important topic, and I thought it would make sense to give readers a bit of a status update – to see how things are going, and what the current status is.

zheap: What has been done since last time

Let’s take a look at the most important things we’ve achieved since our last status report:

  • logical decoding
  • work on UNDO
  • patch reviews for UNDO
  • merging codes
  • countless fixes and improvements

zheap: Logical decoding

The first thing on the list is definitely important. Most people might be familiar with PostgreSQL’s capability to do logical decoding. What that means is that the transaction log (= WAL) is transformed back to SQL so that it can be applied on some other machine, leading to identical results on the second server. The capability to do logical decoding is not just a given. Code has to be written which can decode zheap records and turn them into readable output. So far this implementation looks good. We are not aware of bugs in this area at the moment.


zheap: Logical decoding


Work on UNDO

zheap is just one part of the equation when it comes to new storage engines. As you might know, a standard heap table in PostgreSQL will hold all necessary versions of a row inside the same physical files. In zheap this is not the case. It is heavily based on a feature called “UNDO” which works similar to what Oracle and some other database engines do. The idea is to move old versions of a row out of the table and then, in case of a ROLLBACK, put them back in .

What has been achieved is that the zheap code is now compatible with the new UNDO infrastructure suggested by the community (which we hope to see in core by version 15). The general idea here is that UNDO should not only be focused on zheap, but provide a generic infrastructure other storage engines will be able to use in the future as well. That’s why

Posted by Luca Ferrari on 2021-07-20 at 00:00

How to see the available and/or installed extensions?

PostgreSQL Extension Catalogs

There are three main catalogs that can be useful when dealing with extensions:

The former one, pg_extension provides information about which extensions are installed in the current database, while the latter, pg_available_extensions provides information about which extensions are available to the cluster.
The difference is simple: to be used an extension must appear first on pg_available_extensions, that means it has been installed on the cluster (e.g., via pgxnclient). From this point on, the extension can be installed into the database by means of a CREATE EXTENSION statement; as a result the extension will appear into the pg_extension catalog.

As an example:

testdb=> select name, default_version from pg_available_extensions;
        name        | default_version 
 intagg             | 1.1
 plpgsql            | 1.0
 dict_int           | 1.0
 dict_xsyn          | 1.0
 adminpack          | 2.1
 intarray           | 1.3
 amcheck            | 1.2
 autoinc            | 1.0
 isn                | 1.2
 bloom              | 1.0
 fuzzystrmatch      | 1.1
 jsonb_plperl       | 1.0
 btree_gin          | 1.3
 jsonb_plperlu      | 1.0
 btree_gist         | 1.5
 hstore             | 1.7
 hstore_plperl      | 1.0
 hstore_plperlu     | 1.0
 citext             | 1.6
 lo                 | 1.1
 ltree              | 1.2
 cube               | 1.4
 insert_username    | 1.0
 moddatetime        | 1.0
 dblink             | 1.2
 earthdistance      | 1.1
 file_fdw           | 1.0
 pageinspect        | 1.8
 pg_buffercache     | 1.3
 pg_freespacemap    | 1.2
 pg_prewarm         | 1.2
 pg_stat_statements | 1.8
 pg_trgm            | 1.5
 pg_visibility      | 1.2
 pgcrypto           | 1.3
 pgrowlocks         | 1.2
 pgstattuple        | 1.5
 postgres_fdw       | 1.0
Posted by Paul Ramsey in Crunchy Data on 2021-07-19 at 15:01

One of the less visible improvements coming in PostGIS 3.2 (via the GEOS 3.10 release) is a new algorithm for repairing invalid polygons and multipolygons.

Algorithms like polygon intersection, union and difference rely on guarantees that the structure of inputs follows certain rules. We call geometries that follow those rules "valid" and those that do not "invalid".

Posted by Andreas 'ads' Scherbaum on 2021-07-19 at 14:00
PostgreSQL Person of the Week Interview with Rafia Sabih: I am basically from India, currently living in Berlin, Germany. I started my PostgreSQL journey in my masters and continued it by joining EDB, India. Then, I increased my spectrum to Postgres on Kubernetes working at Zalando.

A quickstart guide to create a web map with the Python-based web framework Django using its module GeoDjango, the PostgreSQL database with its spatial extension PostGIS and Leaflet, a JavaScript library for interactive maps.

Some time ago I wrote about new options for explains – one that prints settings that were modified from default. This looks like this: Aggregate (cost=35.36..35.37 rows=1 width=8) -> Index Only Scan using pg_class_oid_index on pg_class (cost=0.27..34.29 rows=429 width=0) Settings: enable_seqscan = 'off' Finally, today, I pushed a change that displays them on To … Continue reading "Display “settings” from plans on"

Logical Replication was introduced in PostgreSQL-10 and since then it is being improved with each version. Logical Replication is a method to replicate the data selectively unlike physical replication where the data of the entire cluster is copied. This can be used to build a multi-master or bi-directional replication solution. One of the main differences as compared with physical replication was that it allows replicating the transaction only at commit time. This leads to apply lag for large transactions where we need to wait to transfer the data till the transaction is finished. In the upcoming PostgreSQL-14 release, we are introducing a mechanism to stream the large in-progress transactions. We have seen the replication performance improved by 2 or more times due to this for large transactions especially due to early filtering. See the performance test results reported on hackers and in another blog on same topic. This will reduce the apply lag to a good degree.

The first thing we need for this feature was to decide when to start streaming the WAL content. One could think if we have such a technology why not stream each change of transaction separately as and when we retrieve it from WAL but that would actually lead to sending much more data across the network because we need to send some additional transaction information with each change so that the apply-side can recognize the transaction to which the change belongs. To address this, in PostgreSQL-13, we have introduced a new GUC parameter logical_decoding_work_mem which allows users to specify the maximum amount of memory to be used by logical decoding, before which some of the decoded changes are either written to local disk or stream to the subscriber. The parameter is also used to control the memory used by logical decoding as explained in the blog.

The next thing that prevents incremental decoding was the delay in finding the association of subtransaction and top-level XID. During logical decoding, we accumulate all changes along with its (s

Posted by Dinesh Chemuduru in MigOps on 2021-07-16 at 22:38

In December 2020, you might have seen an article from CentOS about shifting their focus towards CentOS stream, which is the upstream version of the RHEL. CentOS also mentioned that the version 8 would be EOL (end of life) by the end of the 2021. This means that it will no longer receive any updated fixes from it's upstream version of RHEL. A few days after this announcement, Rocky Linux was announced by the CentOS founder, Gregory Kurtzeras, as a 100% bug-for-bug compatible with RHEL. The Rocky Linux project quickly gained so much of attention, and also got sponsors from the cloud vendors like AWS, Google Cloud and Microsoft. We wanted to take this opportunity and write an article about CentOS vs Rocky Linux Benchmark with PostgreSQL.


As of now, CentOS is widely used in productions, because it was a downstream version of RHEL. This means that the CentOS was receiving all the RHEL criticial bug fixes for free, which makes CentOS as robust and reliable as RHEL. The future of CentOS project is it's Stream version, which is the upstream of RHEL. This means that CentOS Stream may not be receiving any such critical bug fixes from RHEL, instead CentOS Stream bug fixes will be pushed down to the RHEL project.

CentOS vs Rocky Linux Benchmark

After seeing the Rocky Linux Project announcement, we wanted to run some benchmark regarding PostgreSQL and see if we get the same performance as CentOS 8 with Rocky Linux 8. To run this benchmark, we have chosen the phoronix tool, which offers a big list of benchmarking test suites. By using this phoronix tool, we will be running a few set of general benchmarks besides to PostgreSQL's pgbench.

Phoronix test suites

Phoronix is a system benchmarking tool which offers a big list of test suites. This tool provides test suites for the CPU, MEMORY, DISK, COMPILE, etc. for most of the operating systems. We will be using this tool to perform the benchmarking on both Rocky Linux and CentOS. For the purpose of this article, we have considered run the following

Today I released pspg 5.1.0. Mostly this is bugfix and refactoring release, but there is one, I hope, interesting function. You can try to press Ctrl o for temporal switch to terminal's primary screen. In primary screen you can see psql session. After pressing any key, the terminal switch to alternative screen with pspg.

Thanks to Tomas Munro work, the psql \watch command will supports pagers (in PostgreSQL 15). In this time only pspg can do this work (in streaming mode). When you set environment variable PSQL_WATCH_PAGER, the \watch command redirects otputs to specified pager (for pspg export PSQL_WATCH_PAGER="pspg --stream". Next you can run command:

select * from pg_stat_database \watch 5

select * from pg_stat_activity where state='active' \watch 1
So, some time ago, Pg devs added multi ranges – that is datatype that can be used to store multiple ranges in single column. The thing is that it wasn't really simple how to get list of ranges from within such multirange. There was no operator, no way to split it. A month ago Alexander … Continue reading "How to get list of elements from multiranges?"
Posted by Aya Iwata in Fujitsu on 2021-07-15 at 01:04

The usability of the libpq feature to trace application's server/client communications has been enhanced in PostgreSQL 14, with an improved format and an option to control output.

Posted by Egor Rogov in Postgres Professional on 2021-07-15 at 00:00

We've already discussed some object-level locks (specifically, relation-level locks), as well as row-level locks with their connection to object-level locks and also explored wait queues, which are not always fair.

We have a hodgepodge this time. We'll start with deadlocks (actually, I planned to discuss them last time, but that article was excessively long in itself), then briefly review object-level locks left and finally discuss predicate locks.


When using locks, we can confront a deadlock. It occurs when one transaction tries to acquire a resource that is already in use by another transaction, while the second transaction tries to acquire a resource that is in use by the first. The figure on the left below illustrates this: solid-line arrows indicate acquired resources, while dashed-line arrows show attempts to acquire a resource that is already in use.

To visualize a deadlock, it is convenient to build the wait-for graph. To do this, we remove specific resources, leave only transactions and indicate which transaction waits for which other. If a graph contains a cycle (from a vertex, we can get to itself in a walk along arrows), this is a deadlock.


Posted by Luca Ferrari on 2021-07-15 at 00:00

I did some more experiments with WALs.

How much data goes into the WALs? (part 2**

In order to get a better idea about how WAL settings can change the situation within the WAL management, I decided to run a kind of automated test and store the results into a table, so that I can query them back later.
The idea is the same of my previous article: produce some workload, meausere the differences in the Log Sequence Numbers, and see how the size of WALs change depending on some setting. This is not an accurate research, it’s just a quick and dirty experiment.

At the end, I decided to share my numbers so that you can have a look at them and elaborate a bit more. For example, I’m no good at all at doing graphs (I know only the very minimum about gnuplot!).

!!! WARNING !!!

WARNING: this is not a guide on how to tune WAL settings! This is not even a real and comprhensive set of experiments, it is just what I’ve played with to see how much traffic can be generated for certain amount of workloads.
Your case and situation could be, and probably is, different from the very simple test I’ve done, and I do not pretend to be right about the small and obvious conclusions I come up at the end. In the case you see or know something that can help making more clear what I write in the following, please comment or contact me!

Set up

First of all I decided to run an INSERT only workload, so that the size of the resulting table does not include any bloating and is therefore comparable to the effort about the WAL records.
No other database activity was ongoing, so that the only generated WAL traffic was about my own workload.
Each time the configuration was changed, the system was restarted, so that every workload started with the same (empty) clean situation and without any need to reason about ongoing checkpoints. Of course, checkpoints were happening as usual, but not at the beginning of the workload.

I used two tables to run the test:

  • wal_traffic stores the results of each ru

Too often, web tiers are full of boilerplate that does nothing except convert a result set into JSON. A middle tier could be as simple as a function call that returns JSON. All we need is an easy way to convert result sets into JSON in the database.

PostgreSQL has built-in JSON generators that can be used to create structured JSON output right in the database, upping performance and radically simplifying web tiers.

Fortunately, PostgreSQL has such functions, that run right next to the data, for better performance and lower bandwidth usage.

Most people in the SQL and in the PostgreSQL community have used the LIMIT clause provided by many database engines. However, what many do not know is that LIMIT / OFFSET are off standard and are thus not portable. The proper way to handle LIMIT is basically to use SELECT … FETCH FIRST ROWS. However, there is more than meets the eye.


Before we dig into some of the more advanced features we need to see how LIMIT and FETCH FIRST ROWS can be used. To demonstrate this feature, I have compiled a simple data set:

test=# CREATE TABLE t_test (id int);
test=# INSERT INTO t_test 
VALUES  (1), (2), (3), (3), 
(4), (4), (5);
test=# TABLE t_test;
(7 rows)

Our data set has 7 simple rows. Let’s see what happens if we use LIMIT:

test=# SELECT * FROM t_test LIMIT 3;
(3 rows)

In this case, the first three rows are returned. Note that we are talking about ANY rows here. Whatever can be found first is returned. There is no special order.

The ANSI SQL compatible way of doing things is as follows:

test=# SELECT * 
           FROM  t_test 
(3 rows)

Many of you may never have used or seen this kind of syntax before, but this is actually the “correct” way to handle LIMIT.

However, there is more: What happens if NULL is used inside your LIMIT clause? The result might surprise you::

test=# SELECT * FROM t_test LIMIT NULL;
(7 rows)

The database engine does not know when to stop returning rows. Remember, NULL is undefined, so it does not mean zero. Therefore, all rows are returned. You have to keep that in mind in order to avoid unpleasant surprises…


WITH TIES has been introduced in PostgreSQL 13 and fixes a common problem: handling duplicates. If you fetch the first couple of rows, PostgreSQL stops at a fixed number of rows. However, what happens if the same data

Posted by Neil Chen in Highgo Software on 2021-07-14 at 02:34

During my most recent expedition of going through PostgreSQL source code, I ran into some mysterious variables in the page structure of some indexes. I will explain later why i am calling these mysterious but first have a look at these variables…

typedef struct HashPageOpaqueData
uint16 hasho_page_id; /* for identification of hash indexes */
} HashPageOpaqueData;


typedef struct SpGistPageOpaqueData
uint16 spgist_page_id; /* for identification of SP-GiST indexes */
} SpGistPageOpaqueData;

You may find that in Postgres’s kernel code, they just give them a constant, but they don’t really use them anywhere (well as far as i have looked). That was my reason for calling these variables mysterious but finally found a clue in the code comment of the macro definition SPGIST_PAGE_ID.

The page ID is for the convenience of pg_filedump and similar utilities, which otherwise would have a hard time telling pages of different index types apart. 

So here comes my reason for writing this blog, this is about the pg_filedump utility, it will provide with an introduction to the utility and demo of basic functionality it provides.

Understanding and using pg_filedump

pg_filedump is a utility to format PostgreSQL heap/index/control files into a human-readable form. 

So without further ado, let’s dig into it by first operating it according to the introduction in the README to intuitively understand its functions:

Create test data:

CREATE TABLE test(id int, name text);

INSERT INTO test values(001, 'Aerith'), (002, 'Cloud'), (003, 'Tifa');

View file path:

SELECT pg_relation_filepath(‘test’);
(1 row)

Important: remember to execute checkpoint to ensure that the file contents are written to disk.

Using pg_filedump viewing files:

% pg_filedump $PGDATA/base/14236/164333

* PostgreSQL File/Bloc
Posted by Luca Ferrari on 2021-07-13 at 00:00

What is the amount of traffic generated in the Write Ahead Logs?

How much data goes into the WALs?

PostgreSQL exploits the Write Ahead Logs (WALs) to make data changes persistent: whenever you COMMIT (implicitly or explicitly) a work, the data is stored in the WALs before it phisically hits the table it belongs to.
There are different advantages in this approach, most notably performances and the ability to survive a crash.
And one beautiful thing about PostgreSQL is that it provides you all the tools to follow, study and understand what it is happening under the hood. With regard to the WALs, there are few pg_wal_xxx functions that can be exploited to get a clue about what is happening in the WALs.
In this post I’m going to use mainly:

  • pg_current_wal_lsn() that provides the current offset within the WAL stream where the next thing will happen. Such offset in the WAL stream is called Log Sequence Number or LSN for short;
  • pg_walfile_name() that given a Log Sequence Number (LSN) provides you the name of the WAL file, in the pg_wal directory, that contains the WAL location.

It is worth spending a little time to explain what LSNs are.
PostgreSQL organizes the WALs into files large 16 MB each (you can change this setting, but assume you will not). Every time a WAL file is full, that is it contains 16 MBG of valid WAL data, PostgreSQL produces a new file (or recycles a no more used one).
The database must know exactly when things happened during the history of transactions, and this means it must be able to point to a location into the WAL files to clearly identify a transaction, or a statement, or something else. This location is expressed a Log Sequence Number, something that points the server to an offset within the WAL stream.
Therefore, when you execute an SQL statement, the database stores the result of the statement into the WALs at the position indicated by the current log sequence number, and the next statement will happen at a different log sequence numbe

Posted by Michał Mackiewicz on 2021-07-12 at 16:51

Even if logical replication is a method of major upgrade itself, sometimes you’ll want to mix and match them. Consider the following scenario: a big and busy database that has a logical replica, which has its specific tables and indexes. When it’s time to upgrade, you will probably want to upgrade both instances – the publisher and the subscriber. As the database is big an busy, using logical replication itself or dump-and-restore aren’t the best choices – those methods will require massive amount of storage, and dump-and-restore will also require a long period of downtime. So, the most viable option is to use the pg_upgrade tool, capable of upgrading a multi-terabyte cluster in the matter of minutes with –link option. But…

pg_upgrade will destroy all replication slots

After successful pg_upgrade run, the new cluster will have no replication slots. That means if you allow your users and applications to connect to the upgraded clusters immediately, any data changes won’t be replicated to subscribers. So, it’s very important to ensure no data is written to the publisher database until logical replication is up and running.

Things to do when upgrading a publisher-subscriber environment

So, how to correctly run pg_upgrade without losing any data changes? Those steps worked for me:

  1. Block all traffic to the primary database from apps and users – using firewall settings, the “host all all reject” rule in pg_hba.conf file, or both. Allow only replication connections from the subscriber(s).
  2. Check if there is any replication lag, and when it reaches zero, run DROP SUBSCRIPTION on the subscriber. This will drop the replication slot on publisher – but it would be removed by pg_upgrade anyway.
  3. Upgrade the subscriber cluster.
  4. Upgrade the publisher cluster.
  5. On the subscriber, CREATE SUBSCRIPTION … WITH (copy_data=FALSE)
  6. Check if the logical replication works as expected, for example add a bogus row to some replicated table, check for its existence on subscriber and delet
Posted by Andreas 'ads' Scherbaum on 2021-07-12 at 14:00
PostgreSQL Person of the Week Interview with Jean-Christophe Arnu: Hi, I’m Jean-Christophe. I live in the south of France, near Toulouse. As far as I can remember, I have always been interested in computer science and programming. I moved a lot when I was a child but I settled in the Toulouse area when I started university. I am the father of 3 children, which is my first full time job, my second being a PostgreSQL database consultant in a PostgreSQL and Cloud dedicated company (we all are remote workers): LOXODATA.
Posted by Alexey Lesovsky on 2021-07-12 at 13:24

New pgSCV 0.7.0 has been released, with new features, bug fixes and minor improvements.

Basic authentication and TLS support. When pgSCV have to be installed in public or untrusted networks, it can be protected with TLS and basic authentication. Basic auth restricts access and TLS encrypts communication. It is strongly recommended to use both and TLS and basic auth — without using TLS, auth credentials could be easily intercepted. Checkout the wiki for configuration details.

Patroni support. Patroni is the widely used tool for Postgres high availability and often used in Postgres infrastructure. Added auto-discovery for local Patroni services with collecting metrics when it is found. But, few days ago, new version of Patroni has been released with built-in metrics. Hmm… ok, I’m going to keep Patroni metrics on pgSCV, also added compatibility and keep some metrics which are not present in Patroni about time of latest timeline change.

Refactor the collecting of Postgres metadata. You might be know that different version of Postgres have different stats features (view and functions), and for collecting metrics, pgSCV have to know extra details about Postgres. For example, Postgres version, installed pg_stat_statements (database and schema), status of recovery, data block size, and etc. Earlier versions of pgSCV are collect metadata at startup. This approach has a downside — this state of Postgres might change — version might be upgraded, pg_stat_statements reconfigured, block size recompiled, recovery enabled and so on. Hence, pgCSV has to always have fresh metadata. New pgSCV version updates metadata at every metrics collecting. This adds some tiny overhead (a few SQL queries), which is negligible in fact.

Remove an internal query normalization. This normalization was used to overcome drawbacks of pg_stat_statements’ native normalization. But using this extra normalization adds some complexity — queries normalized by pgSCV become are not the same as in pg_stat_statements. This might confuse. In n


1. Introduction

I am working on a new PostgreSQL feature that redefines the way a tuple’s visibility status is determined. The feature is working very nicely until I start doing a large SELECT query, which triggers PostgreSQL to spawn multiple parallel workers to process the request. When this happens, the feature I am working on start to yield incorrect results. A good portion of the data tuples returned are missing because they are considered as invisible, while some portion of it remains visible. It immediately came to my attention that the new feature I am working on does not work in parallel worker mode and somehow I need to find a way to debug into a spawned parallel worker to examine how it is computing the visibility and what is missing inside.

In this blog, I would like to share with you how I use GDB to debug and trace into a new parallel worker spawned by Postmaster in order to fix the visibility issue.

2. GDB Basics

I wrote another blog previously that shows how to use GDB to trace and debug a PostgreSQL issues and share some of the most common commands that I use every day to resolve software issues. If you are new to GDB, I suggest giving this blog a read here

3. How and When does PG Spawn A New Parallel Worker

When you use psql to connect to a PostgreSQL database, it will spawn a new backend worker process to serve this connecting client. Most of the queries you provide will be processed by this backend process, includes SELECT, UPDATE, INSERT…etc. By default, if your SELECT query will require doing a sequential scan over 8MB of data, it will try to use a parallel worker to help speed up the processing. This 8MB threshold can be configured by the min_parallel_table_scan_size parameter in postgresql.conf . There is another configuration parameter max_parallel_workers that controls the maximum number of parallel workers is allowed to be spawned. The default is 8.

Technically, I can avoid my visibility issues simply by either setting min_parallel_table_scan_size to a hug


We recently announced the release of version 5.0 of  PGO, the open source Postgres Operator from Crunchy Data. In this previous post, I discussed the design decisions and architecture around building the next generation of Kubernetes Native Postgres Operator. Now let's further dive into the feature set and demonstrate how easy it is to get started

With many years of active development, PGO put forward a strong feature set for managing open source Postgres databases on Kubernetes as conveniently as possible. Our goal was to combine the core features needed to run Postgres in production and our lessons running Postgres on Kubernetes to create a Kubernetes Operator built for the declarative workflow.

An important design goal for PGO 5.0 was to make it as easy as possible to run production-ready Postgres with the features that one expects. Let's see what it takes to run cloud native Postgres that is ready for production.

Cloud Native Postgres Ready for Production

There are a certain set of features that are essential to running any database, let alone Postgres in production. We had implemented all of these features in previous versions of the Postgres Operator and moved them into a fully declarative workflow for PGO 5.0:

LATERAL joins are one of the lesser-known features of PostgreSQL and other relational databases such as Oracle, DB2 and MS SQL. However, LATERAL joins are a really useful feature, and it makes sense to take a look at what you can accomplish with them.

Inspecting FROM more closely

Before we dive into LATERAL, it makes sense to sit back and think about SELECT and FROM clauses in SQL on a more philosophical level. Here is an example:

SELECT whatever FROM tab;

Basically, we could see this statement as a loop. Writing this SQL statement in pseudo code would look somewhat like the following snippet:

for x in tab
     “do whatever”
end loop


For each entry in the table, we do what the SELECT clause says. Usually data is simply returned as it is. A SELECT statement can be seen as a loop. But what if we need a “nested” loop? This is exactly what LATERAL is good for.

LATERAL joins: Creating sample data

Let’s imagine a simple example. Imagine we have a line of products, and we’ve also got customer wishlists. The goal now is to find the best 3 products for each wishlist. The following SQL snippet creates some sample data:

    SELECT   id AS product_id,
             id * 10 * random() AS price,
             'product ' || id AS product
    FROM generate_series(1, 1000) AS id;

CREATE TABLE t_wishlist
    wishlist_id        int,
    username           text,
    desired_price      numeric

    (1, 'hans', '450'),
    (2, 'joe', '60'),
    (3, 'jane', '1500')

The product table is populated with 1000 products. The price is random, and we used a pretty creative name to name the products:

test=# SELECT * FROM t_product LIMIT 10;
 product_id | price              | product
          1 | 6.756567642432323  | product 1
          2 | 5.284467408540081  | product 2
          3 | 28.284196164210904 | product 3
          4 | 13.543868035690423 | product 4
          5 | 30.576923884383

pgpool-II is a well-known tool to pool PostgreSQL connections and load-balance work loads. In this blog, we will verify whether pgpool works well on ARM64 architecture.

ARM64 packages

On Ubuntu, pgpool2 ARM64 debian package is made available by the Debian PostgreSQL Maintainers :

$ dpkg -s pgpool2
Maintainer: Debian PostgreSQL Maintainers
Architecture: arm64
Version: 4.1.4-3.pgdg18.04+2
Replaces: pgpool
Depends: libpgpool0 (= 4.1.4-3.pgdg18.04+2), lsb-base (>= 3.0-3), postgresql-common (>= 26), ucf, libc6 (>= 2.17), libmemcached11, libpam0g (>=, libpq5, libssl1.1 (>= 1.1.0)

On CentOS, the ARM64 yum packages are not available. But you can always download the source from the repository or using git and then install from the source.

Installation from source

pgpool build requires installed PostgreSQL on the same machine. The build uses 'pg_config' command to locate the required PostgreSQL header files, library files, and executables. So ensure that your PATH environment variable has the PostgreSQL bin directory so that './configure' pickups up the right pg_config :

export PATH=/bin:$PATH
$ ./configure --prefix=
pgpool will be installed at the location specified by --prefix.

Alternatively, you can explicitly specify the required PostgreSQL directories :
$ ./configure --prefix= --with-pgsql-includedir=$PGDIR/include --with-pgsql-libdir=$PGDIR/lib -with-pgsql-bindir=$PGDIR/bin

If you downloaded the source from the git repository, you need to set the branch to the latest stable one :
$ git branch --track V4_2_STABLE remotes/origin/V4_2_STABLE
$ git checkout V4_2_STABLE
$ git branch

Now install pgpool2:
$ make
$ make install

Create a pcp.conf file out of the supplied sample file:
cp ./etc/pcp.conf.samp


We're excited to announce the release of PGO 5.0, the open source Postgres Operator from Crunchy Data. While I'm very excited for you to try out PGO 5.0 and provide feedback, I also want to provide some background on this release.

When I joined Crunchy Data back in 2018, I had heard of Kubernetes through my various open source activities, but I did not know much about it. I learned that we had been running Postgres on Kubernetes and OpenShift in production environments for years. This included the release of one of the first Kubernetes Operators! It was quite remarkable to see how Crunchy lead the way in cloud native Postgres. I still remember how excited I was when I got my first Postgres cluster up and running on Kubernetes!

Many things have changed in the cloud native world over the past three years. When I first started giving talks on the topic, I was answering questions like, "Can I run a database in a container?" coupled with "Should I run a database in a container?" The conversation has now shifted. My colleague Paul Laurence wrote an excellent article capturing the current discourse. The question is no longer, "Should I run a database in Kubernetes" but "Which database should I run in Kubernetes?" (Postgres!).

Along with this shift in discussion is a shift in expectation for how databases should work on Kubernetes. To do that, we need to understand the difference between an imperative workflow and a declarative workflow.

Cloud Native Declarative Postgres

Posted by Alexander Sosna on 2021-07-07 at 00:01
Intro to Memory Management Most software can not predict the exact memory needs during compile time, so memory is dynamically allocated during run time. When a programm needs more memory it requests a new chunk of memory from the operating system. The OS can either grant or deny the request. Most modern software does not always care very much about the amount that is actually needed and tries to get a lot more than is currently necessary - just to have some spare for bad times.