Latest Blog Posts

Monitoring a PostgreSQL Patroni Cluster
Posted by Agustin Gallego in Percona on 2023-06-09 at 13:42
Percona Monitoring and Management (PMM) boasts many functionalities that support its extension, be it by using Custom Queries, Custom Scripts, or by collecting data from already available External Exporters.

In this short blog post, we will see how to quickly (and easily) monitor a PostgreSQL cluster managed by Patroni. I will assume you already have an environment set up and running, in which the PMM client is already monitoring the PostgreSQL database. I will assume a three-node PostgreSQL cluster, but it will, of course, work with other setups. If needed, refer to the online documentation on this.

Adding the Patroni metrics to PMM

Since version 2.1.0, Patroni exposes metrics in a Prometheus-compatible format via its /metrics endpoint.

This means that we can instantly benefit from them by using the PMM External Exporters feature.

In a nutshell, we just need to run the following in each client node:

shell> pmm-admin add external --listen-port=8008 --service-name=pg-nodeX-patroni

Here, --service-name can be whatever naming scheme you want that lets you easily know which Patroni node it refers to.

After running these commands and waiting some seconds, we should see metrics incoming in the Advanced Data Exploration dashboard, like the patroni_primary one:

Patroni Advanced Data Exploration dashboard

Until now, no Patroni dashboard used these metrics. So I took the chance while working on a customer ticket to improve this, and created my own dashboard (which is proof of the power behind PMM’s extensibility, in my opinion).

Importing the new Patroni Dashboard

I have uploaded the dashboard to Grafana Labs so it’s easy for everyone to import and use. You can use the ID 18870 to do so in three easy steps.

1- Click on the Import dashboard button in PMM:


pgBackRest File Bundling and Block Incremental Backup
Posted by David Steele in Crunchy Data on 2023-06-09 at 13:00

Crunchy Data is proud to support the pgBackRest project, an essential production-grade backup tool used in our fully managed and self-managed Postgres products. pgBackRest is also available as an open source project.

pgBackRest provides:

  • Full, differential, and incremental backups
  • Checksum validation of backup integrity
  • Point-in-Time recovery

pgBackRest recently released v2.46 with support for block incremental backup, which saves space in the repository by storing only changed parts of files. File bundling, released in v2.39, combines smaller files together for speed and cost savings, especially on object stores.

Efficiently storing backups is a major priority for the pgBackRest project but we also strive to balance this goal with backup and restore performance. The file bundling and block incremental backup features improve backup and, in many cases, restore performance while also saving space in the repository.

In this blog we will provide working examples to help you get started with these exciting features.

File bundling

  • combines smaller files together
  • improves speed on object stores like S3, Azure, GCS

Block incremental backup

  • saves space by storing only changed file parts
  • improves efficiency of delta restore

Sample repository setup

To demonstrate these features we will create two repositories. The first repository will use defaults. The second will have file bundling and block incremental backup enabled.

Configure both repositories:
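The configuration listing from the original post did not survive here; as a hedged sketch (paths and PostgreSQL version are hypothetical; repoN-bundle and repoN-block are the pgBackRest option names), a pgbackrest.conf along these lines sets up both repositories:

```ini
[demo]
pg1-path=/var/lib/postgresql/15/demo

[global]
# Repository 1: defaults only
repo1-path=/var/lib/pgbackrest/repo1

# Repository 2: file bundling and block incremental backup enabled
repo2-path=/var/lib/pgbackrest/repo2
repo2-bundle=y
repo2-block=y
```

A backup can then be directed at either repository with the --repo option, e.g. pgbackrest --stanza=demo --repo=2 backup.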





Create the stanza on both repositories:

$ pgbackrest --stanza=demo stanza-create

The block incremental backup feature is best demonstrated with a larger dataset. In particular, we would prefer to have at least one table that is near the maximum segment size of


PostgreSQL 15: a gem from the community
Posted by Edco Wallet on 2023-06-07 at 13:13

CI/CD with Crunchy Postgres for Kubernetes and Argo
Posted by Bob Pacheco in Crunchy Data on 2023-06-07 at 13:00

Continuous Integration / Continuous Delivery (CI/CD) is an automated approach in which incremental code changes are made, built, tested and delivered. Organizations want to get their software solutions to market as quickly as possible without sacrificing quality or stability. While CI/CD is often associated with application code, it can also be beneficial for managing changes to PostgreSQL database clusters.

GitOps plays an important part in enabling CI/CD. If you are unfamiliar with GitOps, I recommend starting with my previous post on Postgres GitOps with Argo and Kubernetes.

Today I have a working sample to walk through for implementing basic CI/CD using Crunchy Postgres for Kubernetes, ArgoCD, and a self test container. In the following steps we will:

  • Deploy a Postgres cluster to a developer namespace
  • Run a series of tests on the deployed cluster
  • Automatically deploy the same Postgres cluster to a QA namespace once the tests pass


  • Crunchy Postgres for Kubernetes v5.3 or later.
  • The sidecar functionality should be enabled. To enable it, add the following to the Deployment.spec.template.spec.containers.env section of the manager.yaml file located in the postgres-operator-examples/kustomize/install/manager directory:
  value: 'InstanceSidecars=true'
  • ArgoCD v 2.6 or later deployed in the Kubernetes cluster.
  • A private container registry containing the images you want to deploy. Most organizations will pull images, tag them and then upload them into their private registries. For this blog I am using a private registry for all images except the self test. That image is in a public repo in my docker registry.
  • A git repository containing the Crunchy Postgres for Kubernetes manifest to be deployed. Here's a sample manifest you can use or you can fork my git repository.
- kustomization.yaml
  # - argocd-token.yaml
  - hippo-self-test-config.yaml
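Returning to the sidecar prerequisite above: the env entry would look roughly like this in manager.yaml (the variable name PGO_FEATURE_GATES comes from the upstream postgres-operator examples; verify it against your operator version):

```yaml
# Fragment of Deployment.spec.template.spec.containers[].env in manager.yaml
env:
- name: PGO_FEATURE_GATES          # assumed name, per upstream examples
  value: 'InstanceSidecars=true'
```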

PGSQL Phriday #009 Roundup
Posted by Dian Fay on 2023-06-07 at 00:00

Another Phriday in the books, and it's time to see what all happened:

Thanks everyone for participating, and look forward to Alicja's invitation coming around the end of the month!

TIL - Filling prepared statement placeholders automatically with pgbadger
Posted by Kaarel Moppel on 2023-06-06 at 21:00
Hey Postgres friends! Thought I’d try out a new, shorter broadcasting format, as it seems finding some hours for proper blog posts is getting increasingly harder. So here’s a first “Today I learned” type of post on a very useful feature of pgbadger that I just discovered for myself...

Using NGINX as a PostgreSQL Reverse Proxy and Load Balancer
Posted by Tristen Raab in Highgo Software on 2023-06-06 at 20:44


This blog will go over the use cases of setting up NGINX (Engine X) as both a reverse proxy and load balancer for PostgreSQL. NGINX is an excellent feature-rich and open-source project that can help make an application more cohesive. It does this by exposing a single port to the user and proxying requests to different applications in a project based on the URI. This can make it appear as though our entire application is hosted on a single node, while in reality it can be distributed anywhere around the globe.

For the examples below we assume that the PostgreSQL server has some HTTP server sitting in front of it that can accept REST API requests and translate them to PostgreSQL-compatible statements.

NGINX Reverse Proxy

Fig 1. NGINX Reverse Proxy Overview

Figure 1 shows a block diagram with an overview of how NGINX will proxy different requests to different applications. The file nginx.conf controls the configuration parameters of NGINX. It is typically located in /etc/nginx/nginx.conf. To create a proxy like in Figure 1 above, we could add the following to our nginx.conf:

events {
    worker_connections 1024;
}

http {
    server {
        listen localhost:55432;

        location / {
            proxy_pass http://localhost:5433;
        }

        location /auth {
            proxy_pass http://localhost:8080;
        }

        location /stats {
            proxy_pass http://localhost:8081;
        }
    }
}
Once the configuration is set run:

$ sudo nginx -s reload

This will reload NGINX and apply the changes we’ve made.

From top to bottom, let’s go over all the pieces of the NGINX configuration file.

First, at the very top, is the “events” context, which is used to set global parameters that dictate how NGINX runs at a general level. Within this context is the worker_connections parameter, which dictates how many simultaneous connections an NGINX process can have.

Next is the “http” context, which co


What is a schema in PostgreSQL?
Posted by Hans-Juergen Schoenig in Cybertec on 2023-06-06 at 08:00

One way to organize data in PostgreSQL is to make use of schemas. What is a schema in PostgreSQL? And more importantly: What is the purpose of a schema and how can schemas be used to make your life easier? Let’s dive in and find out.

The purpose of a schema

Before you figure out how to use schemas, you need to know what the purpose of a schema is in the first place. To understand that, first take a look at how PostgreSQL is structured:

  • Instance
  • Database
  • Schema
  • Table
  • Row

An “instance” is basically what you start when you deploy PostgreSQL. The next layer is a database. In reality this is what you connect to: in PostgreSQL a connection is always bound to a database inside an instance, which happens early on, right after user authentication.
What is important is the next layer down, between databases and tables: Schemas.

Schemas group tables

Basically, schemas are a way to group tables together. Let’s assume there’s a fairly large data structure: Having 500 tables in one place is certainly harder to manage and understand than to have 10 buckets containing 50 tables each.
It’s simply like organizing pictures: You wouldn’t put all of them into the same folder, but rather group them by year, location, etc. The same logic can be applied to tables.
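As a minimal sketch (the schema and table names here are made up), grouping tables into a schema looks like this:

```sql
-- Create a schema and place a table inside it
CREATE SCHEMA sales;

CREATE TABLE sales.t_order (
    id      bigint PRIMARY KEY,
    ordered timestamptz DEFAULT now()
);

-- Reference the table by its schema-qualified name
SELECT count(*) FROM sales.t_order;
```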

Schemas and PostgreSQL

Now we can focus on how this concept can be applied to PostgreSQL. The first thing we have to look at is the public schema.

Using the “public” schema

The beauty of PostgreSQL is that it doesn’t matter much if you know nothing at all about schemas. The reason is the existence of the public schema, which is there by default. How can we find out which schemas there are in PostgreSQL? psql provides the \dn command to display this information:

demo=# \dn
      List of schemas
  Name  |       Owner       
--------+-------------------
 public | pg_database_owner
(1 row)

In a default scenario, a table will end up in the public schema. Here’s an example:

demo=# CREATE TABLE t_product (

Sharan Foga
Posted by Andreas 'ads' Scherbaum on 2023-06-05 at 14:00
PostgreSQL Person of the Week Interview with Sharan Foga: Hi I’m Sharan - I’m originally from the UK and like travelling. I’ve been lucky enough to have lived and worked in different countries. So far I’ve lived in France, Belgium, New Zealand, Czech Republic and am now based in Sweden.

Configuring PgBouncer for Multi-Port Access
Posted by Robert Bernier in Percona on 2023-06-05 at 13:20

From time to time, situations occur where unusual circumstances dictate out-of-the-box thinking.

For example, suppose you have a system where you’ve installed multiple data clusters onto a single host. What you end up with is a single host dividing and sharing its valuable system resources, i.e., CPU, RAM, disk, etc., between multiple instances of PostgreSQL data clusters. While easy to do in a development environment, it represents technical debt when it reaches production. Sooner or later, one must address this issue; otherwise, one can suffer the consequences of handicapping your entire database infrastructure.

Let’s now move forward in time: your system has scaled, of course, and this shortcut of using multiple data clusters on a single host has become a performance bottleneck. The problem is you either don’t or can’t refactor your application servers; something, maybe, about not having enough time in the day. And, as you may already know, while Postgres can listen on a UNIX domain socket as well as IPv4 and IPv6 addresses, it is nevertheless constrained to listen on just the one port.

So what do you do?

For the experienced sysadmin, there are actually quite a number of “magical” techniques. However, in this case, with a little help from systemd, which manages all service processes, we will solve this little puzzle using PgBouncer with a concoction of configuration files.


Configure the system such that Postgres resides on its default port of 5432 and PgBouncer sits on three ports, i.e., 6432, 6433, and 6434, accessing the resident Postgres server.

The PgBouncer connection pooler will use an administrative account, the ROLE PgBouncer, for the purpose of user authentication. Authentication is to be achieved by accessing the Postgres server’s pg_shadow table and comparing the resultant hash to all incoming connections (this won’t work for cloud setups such as, for example, Amazon RDS).
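PgBouncer typically wires up this kind of authentication with its auth_user and auth_query settings; a hedged sketch of one of the pgbouncer.ini files (the database host/port and file layout are assumptions; the auth_query shown is PgBouncer’s documented default):

```ini
; pgbouncer.ini for the first port; the other instances differ
; only in listen_port (6433, 6434)
[databases]
* = host=127.0.0.1 port=5432

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type   = md5
auth_user   = pgbouncer
auth_query  = SELECT usename, passwd FROM pg_shadow WHERE usename = $1
```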

A set of Systemd configuration files will be created and edited in order to manage the P


High-compression Metrics Storage with Postgres Hyperloglog
Posted by Christopher Winslett in Crunchy Data on 2023-06-05 at 13:00

We have been talking a lot here about using Postgres for metrics, dashboards, and analytics. One of my favorite Postgres tools that makes a lot of this work easy and efficient is Hyperloglog (HLL). Hyperloglog is like Regex, once you understand it -- you feel like it's a superpower. Also, like Regex -- it can't solve everything. In this post I’ll take you through how to get started with HLL and build some sample queries, and get started with simple tuning.

So what is Hyperloglog?

Hyperloglog is a compression and extraction algorithm for counting distinct values. It gets the name from its disk size characteristic of log(log(x)) -- hence the "loglog". Because it's a compression algorithm, results are an approximation. At storage time, it hashes each value, converts the hash to binary, and adds it to a summary. Then, at read time, it approximates the distinct count from the lengths of the runs of leading zeroes in those binary values.

When to use Hyperloglog?

Because it’s an approximation, the use case should accept approximate calculations. Billing systems? No! Visitor tracking? Yes!

To use Hyperloglog, you must have data that counts distinct values across a time series or set of values. Typically, you can use it where you are currently using COUNT(DISTINCT ). The canonical example for Hyperloglog is product behavior metrics, where you are looking to scale something like the following query:

SELECT
        date_trunc('week', received_at) AS query_week,
        COUNT(DISTINCT customer_id) AS active_customer_count
FROM activities
WHERE received_at > '2023-01-01'
GROUP BY 1
ORDER BY 1;
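For scale, the postgresql-hll extension lets you precompute such distinct counts into compact rollups; a sketch using the extension's documented functions (use hll_hash_bigint if customer_id is a bigint; table names follow the example above):

```sql
CREATE EXTENSION IF NOT EXISTS hll;

-- Roll activities up into one hll value per week
CREATE TABLE weekly_rollup AS
SELECT date_trunc('week', received_at) AS query_week,
       hll_add_agg(hll_hash_integer(customer_id)) AS customers
FROM activities
GROUP BY 1;

-- Approximate distinct customers per week (# is the cardinality operator)
SELECT query_week, # customers AS active_customer_count
FROM weekly_rollup
ORDER BY query_week;
```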

Running that query will find the matching set of activities, store the result in memory, then group by iterating over the set, and iterate over the order. The most common solution to this problem is using output caching. The problem with output caching is that it can only answer one question. The cached value for this query can only answer the same question the query answered. Thus, if you wanted to find the


A tale about (incomplete) upgrade from PostgreSQL 12 to 14
Posted by Hubert 'depesz' Lubaczewski on 2023-06-04 at 14:48
This might not interest many of you, but I recently heard about at least two people that stumbled upon the problems I did, so I figured I can write about problems we discovered, and how we solved them (or not). When we began our journey, the latest Pg was 14.x, that's why we're upgrading to … Continue reading "A tale about (incomplete) upgrade from PostgreSQL 12 to 14"

Setting Up a PostgreSQL 3-Node HA Cluster using pg_cirrus
Posted by Syed Salman Ahmed Bokhari in Stormatics on 2023-06-03 at 17:27

Introduction We are thrilled to announce the release of pg_cirrus! First of all, you might be wondering what “cirrus” means. The term refers to the thin and wispy clouds that are often seen at high altitudes. pg_cirrus is a simple and automated solution to deploy highly available 3-node PostgreSQL clusters with auto failover. It is […]

The post Setting Up a PostgreSQL 3-Node HA Cluster using pg_cirrus appeared first on Stormatics.

PostGIS Bundle 3.3.3 for Windows with MobilityDB
Posted by REGINA OBE in PostGIS on 2023-06-03 at 00:34

I recently released the PostGIS 3.3.3 bundle for Windows, which is available on application stackbuilder and the OSGeo download site for PostgreSQL 11 - 15. If you are running PostgreSQL 12 or above, you get an additional bonus extension, MobilityDB, which leverages PostGIS geometry and geography types and introduces several more spatio-temporal types and functions specifically targeted at managing objects in motion.

What kind of management? Think of getting the average speed a train is moving at over a segment in time, or collisions in time, without any long SQL code. Just use a function on the trip path, and voilà. Think about storing GPS data very compactly in a single row/column with time and being able to ask very complex questions with very little SQL. True, PostGIS can do some of this using geometry with Measure (geometryM) geometry types, but you have to deal with the craziness of converting M back to timestamps, which MobilityDB temporal types avoid by encoding time as true PostgreSQL timestamp types.

Anita Graser, of QGIS and Moving Pandas fame, has written several posts about it such as: Visualizing Trajectories with QGIS and mobilitydb and Detecting close encounters using MobilityDB 1.0.

Continue reading "PostGIS Bundle 3.3.3 for Windows with MobilityDB"

Database change management vs performance
Posted by Michael Christofides on 2023-06-02 at 20:57

This month’s PGSQL Phriday topic is “database change management”, proposed by Dian Fay.

While I’ll try to make this topic about performance (as I always do) let me first embark on a quick trip down memory lane.

My background in database change management

Back in 2010, my first full-time job was at a company called Redgate, whose flagship product at the time was SQL Compare. To this day, that tool allows folks to quickly compare SQL Server databases, review the differences, and generate a script to make one match the other. Fun fact: since then, an excellent former teammate of mine, Neil Anderson, has built Postgres Compare. Anyway, back in 2010 when I joined Redgate, they also had a team working on the first version of SQL Source Control, which uses the same engine to sync databases with files in source control and vice versa — pretty cool! I was lucky enough to work with them, and then a few years later, found myself running a wonderful team tasked with building an equivalent tool, but for Oracle.

We also built tooling using these engines to enable continuous integration and even continuous deployment of databases. As such, I’ve seen everything from teams with no source control in place for databases, all the way through to folks who had the vast majority of the process automated!

What about this decade? And what about Postgres?

A lot of the Postgres shops I come across these days use Rails or Django, who already have in-built migrations support. Others use tools such as sqitch, liquibase, or flyway. But honestly, as I mentioned in our podcast episode on version control for databases, I still see a gap between how people manage their databases vs how they manage their application code.

As such, if you already have the following in place for your databases, I think you’re ahead of the curve:

  1. Version control, with the ability to branch

  2. Continuous integration, with unit tests

  3. Automated deployment pipeline being the primar


10 Requirements for Managing Database Changes
Posted by Ryan Booz on 2023-06-02 at 17:54
The host for PGSQL Phriday #009 is Dian Fay, who has asked us to discuss how we manage database changes. Oh my, Dian picked a topic that’s near and dear to me. In fact, one of my (still visible) T-SQL Tuesday blog posts was about this very topic… 7 years ago! Quick aside: For those ... Read more

PGSQL Phriday #009: On Rollback
Posted by Grant Fritchey on 2023-06-02 at 14:14

The invitation this month for #PGSqlPhriday comes from Dian Fay. The topic is pretty simple, database change management. Now, I may have, once or twice, spoken about database change management, database DevOps, automating deployments, and all that sort of thing. Maybe. Once or twice. OK. This is my topic. I’ve got some great examples on […]

The post PGSQL Phriday #009: On Rollback appeared first on Grant Fritchey.

Please welcome Pg_index_watch – a utility for dealing with index bloat on frequently updated tables.
Posted by Maxim Boguk in Data Egret on 2023-06-02 at 12:07

One of the big topics in Postgres is bloat. It is something that every DBA comes across and in fact we have a few good posts on how to work with bloat in our blog. 

Update-heavy tables are a special case: data that, due to business requirements, needs to be updated frequently to stay relevant. Autovacuum may be a good solution for table bloat (you can check this post describing how to do that), however it does not help with index bloat. 

Pg_index_watch resolves this issue by automatically rebuilding indexes when needed. 

How does it work?

With the introduction of REINDEX CONCURRENTLY in PostgreSQL 12, there was finally a safe way to rebuild indexes without heavy locks. At the same time, it was still unclear what criterion to use to decide whether an index is bloated (and should be rebuilt) or not.
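The command itself is a one-liner (the index name here is hypothetical):

```sql
-- PostgreSQL 12+: rebuild an index without blocking concurrent writes
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;
```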

What was missing is a simple and cheap to run statistical model that would allow us to estimate index bloat ratio without the requirement of reading and analyzing the whole index.

Now, PostgreSQL allows you to access the following:

1) the number of rows in the index (in pg_class.reltuples for the index), and

2) the index size.

Assuming that the ratio of index size to the number of entries is almost always constant (which is correct in 99.9% of cases), we can infer that if the ratio has doubled compared to its ideal state, the index is most likely bloated to twice its ideal size.
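Both inputs are cheap catalog lookups; a sketch of the per-index ratio such a check could compute (reltuples is only an estimate, refreshed by VACUUM/ANALYZE):

```sql
SELECT c.relname,
       pg_relation_size(c.oid)  AS index_size_bytes,
       c.reltuples              AS estimated_entries,
       pg_relation_size(c.oid) / NULLIF(c.reltuples, 0) AS bytes_per_entry
FROM pg_class c
WHERE c.relkind = 'i';
```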

Based on this assumption we developed a system, similar to autovacuum, that automatically tracks the bloat level of indexes and rebuilds them when bloat exceeds a threshold.

Read more on GitHub and try it out!

Let me know what you think.


Analytical Functions: Count of Distinct in Oracle vs. PostgreSQL
Posted by Deepak Mahto on 2023-06-02 at 08:33

This post is for subscribers

PGSQL Phriday #009: Three Big Ideas in Schema Evolution
Posted by Dian Fay on 2023-06-02 at 00:00

I've used several migration frameworks in my time. Most have been variations on a common theme dating back lo these past fifteen-twenty years: an ordered directory of SQL scripts with an in-database registry table recording those which have been executed. The good ones checksum each script and validate them every run to make sure nobody's trying to change the old files out from under you. But I've run into three so far, and used two in production, that do something different. Each revolves around a central idea that sets it apart and makes developing and deploying changes easier, faster, or better-organized than its competition -- provided you're able to work within the assumptions and constraints that idea implies.

sqitch: Orchestration

The first time I used sqitch, I screwed up by treating it like any other manager of an ordered directory of SQL scripts with fancier verification and release management capabilities. It does have those, but they weren't why I used sqitch the second and subsequent times.

sqitch wants you to treat your schema and the statements that define it as a supergraph of all your inter-database-object dependencies. Tables depend on types, on other tables via foreign keys, on functions with triggers or constraints; views depend on tables and functions; functions depend on tables, on other functions, on extensions. Each one, roughly, is a single named migration -- more on that in a bit.

So shipments depend on warehouses, since you have to have a source and a destination for the thing you're shipping, and warehouses depend on regions, because they exist at a physical address subject to various laws and business requirements. shipments also have no meaning independently from the thing being shipped, so in the case I'm filing the serial numbers from, that table also maintains a dependency on weather-stations. Both shipments and warehouses depend on the existence of the set_last_updated audit trigger function. The plan file looks like this:

trigger-set-updated-at 2020-03-19T17:

Logical Replication on Standbys in Postgres 16
Posted by Roberto Mello in Crunchy Data on 2023-06-01 at 13:00

Postgres 16 is hot off the press with the beta release last week. I am really excited about the new feature that allows logical replication from standbys, allowing users to:

  • create logical decoding from a read-only standby
  • reduce the workload on the primary server
  • have new ways to achieve high-availability for applications that require data synchronization across multiple systems or for auditing purposes

A second relevant and exciting new feature coming in 16 is that replication slots on a given standby persist through the promotion of that standby to a primary. This means that in the event of primary server failure and promotion of a standby to primary, the replication slots will persist and the former standby's subscribers will not be affected.

These two together give PostgreSQL a huge boost in performance for big data operations: applications moving data around between physical locations, and folks doing data warehousing, analytics, data integration, and business intelligence. I’m going to walk through an example schema and database setup and offer some sample settings and code for creating logical replication from a standby.
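Before the walkthrough, it helps to see the settings involved; as a hedged sketch (parameter placement follows the PostgreSQL documentation, but verify details against the final Postgres 16 release):

```
# postgresql.conf on the primary (the standby inherits it via the base backup)
wal_level = logical

# postgresql.conf on the standby: lets the primary know which rows
# the standby's logical slots still need
hot_standby_feedback = on
```

A subscriber then simply points its CREATE SUBSCRIPTION connection string at the standby instead of the primary.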

Background on replication

At a high level PostgreSQL supports two main types of replication - streaming/physical and logical. The Write-Ahead Log can stream the entire set of physical files through a connection and represents the entire database on disk. Logical replication offers a more fine-grained approach, where you can specify individual database objects, such as tables or even specific rows, to be replicated to a remote server. You can read more about logical replication basics in Data To Go: Postgres Logical Replication.

A standby server in PostgreSQL is created by taking a base backup of the primary server and continuously applying all changes made on the primary. A hot standby is a standby server that can be promoted to become the primary server. PostgreSQL saves modifications in WAL (Write-Ahead Log) records and replicates them from the primary server to


SQL:2023 is out
Posted by Peter Eisentraut in EDB on 2023-06-01 at 04:00

The news today is that SQL:2023, the new version of the SQL standard, has been published by ISO.

Here are links to the parts on the ISO web site:

(Note, if you want to purchase a copy, you probably only want part 2, unless you intend to work in the particular areas covered by the other parts.)

Here is my previous article on what is new in SQL:2023, and here is the article on the status of SQL:2023 support in PostgreSQL.

Incidentally, the next SQL working group meeting is in less than two weeks’ time, and work on the next version will start then.

Data inconsistency in highly available PostgreSQL clusters
Posted by Umair Shahid in Stormatics on 2023-05-31 at 12:25

If nodes in a database cluster get out of sync, the inconsistency can cause data corruption. This blog describes what to watch out for.

The post Data inconsistency in highly available PostgreSQL clusters appeared first on Stormatics.

SVG Images from Postgres
Posted by Martin Davis in Crunchy Data on 2023-05-30 at 13:00

PostGIS excels at storing, manipulating and analyzing geospatial data. At some point it's usually desired to convert raw spatial data into a two-dimensional representation to utilize the integrative capabilities of the human visual cortex. In other words, to see things on a map.

PostGIS is a popular backend for mapping technology, so there are many options to choose from to create maps. Data can be rendered to a raster image using a web map server like GeoServer or MapServer; it can be converted to GeoJSON or vector tiles via servers such as pg_featureserv and pg_tileserv and then shipped to a Web browser for rendering by a library such as OpenLayers, MapLibre or Leaflet; or a GIS application such as QGIS can connect to the database and create richly-styled maps from spatial queries.

What these options have in common is that they require external tools which need to be installed, configured and maintained in a separate environment. This can introduce unwanted complexity to a geospatial architecture.

This post presents a simple way to generate maps entirely within the database, with no external infrastructure required.

SVG for the win

A great way to display vector data is to use the Scalable Vector Graphic (SVG) format. It provides rich functionality for displaying and styling geometric shapes. SVG is widely supported by web browsers and other tools.

By including CSS and Javascript it's possible to add advanced styling, custom popups, dynamic behaviour and interaction with other web page elements.
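For readers who have not hand-written SVG, a minimal standalone document looks like this (the shape and styling are arbitrary):

```xml
<!-- A styled triangle; browsers render this file directly -->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">
  <path d="M 10 90 L 90 90 L 50 10 Z" fill="steelblue" stroke="black"/>
</svg>
```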

Introducing pg-svg

Generating SVG "by hand" is difficult. It requires detailed knowledge of the SVG specification, and constructing a complex text format in SQL is highly error-prone. While PostGIS has had the function ST_AsSVG for years, it only produces the SVG path data attribute value. Much more is required to create a fully-styled SVG document.

The PL/pgSQL library pg-svg solves this problem! It makes it easy to convert PostGIS data into styled SVG documents. The library provid


New Site, New Partman
Posted by Keith Fiske in Crunchy Data on 2023-05-30 at 12:10
Thanks to some help from my co-workers Elizabeth and David Christenson and their son, I got my site migrated from Wordpress to Hugo! Being a DBA, you’d think I wouldn’t mind having a database backing my website, but the simplicity of managing a static site like Hugo was much more appealing at this point. With a new site that’s far easier to manage, I’m hoping that will motivate me to get back to writing new content on a regular basis.

PostgreSQL for the SQL Server DBA: Transaction Isolation and Table Bloat
Posted by Ryan Booz on 2023-05-30 at 12:00
This is part of a series I began in 2018, documenting my journey from SQL Server to PostgreSQL. As someone that’s been through the process before, my hope is that these discussions will help mitigate some of the surprises and help you get up to speed faster. Today, I’m going to dig into how PostgreSQL ... Read more

ERROR: invalid byte sequence – Fix bad encoding in PostgreSQL
Posted by Laurenz Albe in Cybertec on 2023-05-30 at 08:11

It’s annoying to get error messages caused by encoding problems. But it is more annoying not to get error messages and end up with bad data in the database. I’ll show you how to fix bad encoding.

Some basic facts about encoding in PostgreSQL

Each PostgreSQL database has a server encoding. You can inspect (but not change) the PostgreSQL parameter server_encoding to see the encoding of the database you are connected to. You can choose the database encoding when you create a database, but you cannot change the server encoding after creating the database. This restriction is necessary because changing the server encoding will render string data corrupt.

Each database session has a client encoding, which you can view and change using the PostgreSQL parameter client_encoding. If your client uses the C client library libpq, you can determine the client encoding with the environment variable PGCLIENTENCODING. In addition, psql has the command \encoding to view and change the client encoding. psql tries to guess a good client encoding from the current locale. If the client encoding isn’t specified anywhere, PostgreSQL will set it to the server encoding.

PostgreSQL expects that character data sent by a client are encoded in the client encoding and converts them to the server encoding. Conversely, PostgreSQL converts query results to the client encoding.
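To see what that conversion does to the bytes, iconv can stand in for the server (a sketch that doesn't use PostgreSQL itself; 0xf6 is written as octal \366 for portability):

```shell
# A client sends "schön" in LATIN-1; the server re-encodes it to UTF-8.
# iconv performs the same conversion; od shows the resulting bytes in hex.
printf 'sch\366n' | iconv -f LATIN1 -t UTF-8 | od -An -tx1
# prints: 73 63 68 c3 b6 6e   ("schön" in UTF-8)
```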

PostgreSQL supports many encodings. All supported server encodings must be a superset of ASCII.
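Because every server encoding is a superset of ASCII, pure-ASCII data is immune to this whole class of problem; a quick check (again with iconv rather than the server):

```shell
# "schon" without the umlaut is plain ASCII: LATIN-1 and UTF-8 agree byte for byte
printf 'schon' | iconv -f LATIN1 -t UTF-8 | od -An -tx1
printf 'schon' | od -An -tx1
# both lines print: 73 63 68 6f 6e
```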

Preparing test data

The server encoding of my database is UTF8. We’ll use this simple table:

CREATE TABLE texts (
   id bigint PRIMARY KEY,
   t  text
);

Using the Linux echo command, I create an SQL script in UTF-8 encoding:

echo -e "INSERT INTO texts VALUES (1, 'sch\xc3\xb6n');" \
    > insert_utf8.sql

Then I create a second script, with the same string in LATIN-1 (= ISO 8859-1) encoding:

echo -e "INSERT INTO texts VALUES (2, 'sch\xf6n');" \
    > insert_latin1.sql

Both strings contain the German letter “ö”. LATIN-1 encodes it as a single byte (0xf6), while UTF-8 uses the two bytes 0xc3 0xb6, exactly the escapes in the two scripts above.
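If mis-encoded data hasn't reached the database yet, one option is to transcode the file itself instead of adjusting client_encoding; a sketch with iconv (a generic technique, not necessarily the fix the article goes on to describe past this excerpt):

```shell
# Recreate the LATIN-1 script from the article (0xf6 written as octal \366),
# then transcode the whole file to UTF-8 so a UTF8 client/server pair accepts it
printf "INSERT INTO texts VALUES (2, 'sch\366n');\n" > insert_latin1.sql
iconv -f LATIN1 -t UTF-8 insert_latin1.sql > insert_latin1_utf8.sql
od -An -tx1 insert_latin1_utf8.sql | tail -1   # the "ö" is now the two bytes c3 b6
```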


Carlos Chapi
Posted by Andreas 'ads' Scherbaum on 2023-05-29 at 14:00
PostgreSQL Person of the Week Interview with Carlos Chapi: My name is Carlos Chapi, I’m a computer engineer from Ecuador and, at the time of writing this, I’m 33 years old. I also recently got married so I’m getting used to that too.

Disaster Recovery for PostgreSQL on Kubernetes
Posted by Sergey Pronin in Percona on 2023-05-29 at 12:58

Disaster recovery is not optional for businesses operating in the digital age. With the ever-increasing reliance on data, system outages or data loss can be catastrophic, causing significant business disruptions and financial losses.

With multi-cloud or multi-regional PostgreSQL deployments, the complexity of managing disaster recovery only amplifies. This is where the Percona Operators come in, providing a solution to streamline disaster recovery for PostgreSQL clusters running on Kubernetes. With the Percona Operators, businesses can manage multi-cloud or hybrid-cloud PostgreSQL deployments with ease, ensuring that critical data is always available and secure, no matter what happens.

In this article, you will learn how to set up disaster recovery with Percona Operator for PostgreSQL version 2.

Overview of the solution

Operators automate routine tasks and remove toil. For a standby cluster, the Operator provides the following options:

  1. pgBackRest repo-based standby
  2. Streaming replication
  3. A combination of (1) and (2)

We will review the repo-based standby as the simplest one:

1. Two Kubernetes clusters in different regions, clouds, or running in hybrid mode (on-prem + cloud). One is Main, and the other is Disaster Recovery (DR).

2. In each cluster, there are the following components:

  1. Percona Operator
  2. PostgreSQL cluster
  3. pgBackRest
  4. pgBouncer

3. pgBackRest on the Main site streams backups and Write-Ahead Logs (WALs) to the object storage.

4. pgBackRest on the DR site pulls these backups and applies them to the standby cluster.

Configure main site

Deploy the Operator using your favorite method from our documentation. Once installed, configure the Custom Resource manifest so that pgBackRest uses the object storage of your choice. Skip this step if you already have it configured.

Configure the backups.pgbackrest.repos section by adding the necessary configuration. The example below is for Google Cloud Storage (GCS):
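The GCS example itself is cut off in this excerpt; a minimal sketch of what such a repo definition looks like in the Custom Resource (bucket and secret names are placeholders, and the secret is assumed to hold the GCS service-account key):

```yaml
spec:
  backups:
    pgbackrest:
      configuration:
        - secret:
            name: cluster1-pgbackrest-secrets   # holds the GCS credentials file
      repos:
        - name: repo1
          gcs:
            bucket: my-backup-bucket
```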


PostgreSQL compile times
Posted by Peter Eisentraut in EDB on 2023-05-29 at 04:00

What’s the fastest compiler for PostgreSQL? Let’s take a look.

OS            Compiler            time make -s
macOS 13      gcc-13              3:59.29
              gcc-12              3:42.19
              gcc-11              3:33.35
              clang-16            3:05.05
              clang-15            2:19.71
              clang-14            2:21.03
              clang-13            2:20.72
              Apple clang (14)    1:55.87
Ubuntu 22.04  gcc-12 (default)    2:57.87
              gcc-11              2:24.20
              gcc-10              2:18.28
              clang-15            2:21.62
              clang-14            2:24.23
              clang-13            2:25.68

Compilers keep getting slower, it appears! Maybe don’t use the latest one right away!

Also, gcc vs clang? Not sure, it depends.

The above uses the default optimization level -O2. Let’s see if we can squeeze out more:

Optimization  Apple clang  gcc-13
-O3           1:55.10      4:26.59
-O2           1:52.13
