Latest Blog Posts

Please Welcome Prairie Postgres!
Posted by Henrietta Dombrovskaya on 2025-04-02 at 10:44

Dear Postgres Community! In this post, I want to formally introduce Prairie Postgres, a project that has kept me really busy for the past several months.

Recently, you’ve seen a lot of me online, talking about things I want to do and things I am already doing. Thinking about all these activities holistically, I can group them all around three common themes: Postgres “elementary education,” connecting with application developer communities, and connecting with educational institutions. The overall goal is broader promotion of knowledge about Postgres and its best practices.

Why do I think this is important? Postgres is undoubtedly the most beloved and most well-known open-source database, but too many people jump into the ocean of possibilities without being equipped with even basic knowledge. This creates a barrier to adoption and leads to disappointment when users can't get the most out of PostgreSQL.

The most important mission of Prairie Postgres is creating Postgres educational opportunities in the United States Midwest. We want to focus on the Midwest because other parts of the US have PostgreSQL User Groups and conferences, while the Midwest has a lot of Postgres but not enough community activities. We know that there are many people in the region working with Postgres, and we want to reach out to them. I often hear the sentiment that we see “all familiar faces” at Postgres events, and this includes not only the speakers but the attendees as well, and we want to change that.

We are making our very first steps, but even these first steps wouldn’t be possible without tremendous help from many people. Firstly, I want to thank Pat Wright who gave me the idea to create an NFP, and then helped navigate the legal field – I can’t even imagine how long it would have taken us without his help.

Next I want to thank Dian Fay and Anna Bailliekova, Prairie Postgres co-founders, who jumped into this adventure and took upon themselves the most important task of never letting me make any dec

[...]

2025 Postgres Extensions Mini Summit Two
Posted by David Wheeler in Tembo on 2025-04-01 at 19:32

Last Wednesday, March 26, we hosted the second of five virtual Extension Mini-Summits in the lead up to the big one at the Postgres Development Conference (PGConf.dev) on May 13 in Montreal, Canada. Peter Eisentraut gave a very nice presentation on the history, design decisions, and problems solved by “Implementing an Extension Search Path”. That talk, plus another 10-15m of discussion, is now available for your viewing pleasure:

If you’d like to attend any of the next three Mini-Summits, join the Meetup!

Once again, with many thanks to Floor Drees for the effort, here’s the transcript from the session.

Introduction

Floor Drees introduced the organizers:

Peter Eisentraut, contributor to PostgreSQL development since 1999, talked about implementing an extension search path.

The stream and the closed captions available for the recording are supported by PGConf.dev and their gold level sponsors, Google, AWS, Huawei, Microsoft, and EDB.

Implementing an extension search path

Peter: Thank you for having me!

I’m gonna talk about a current project by me and a couple of people I have worked with, and that will hopefully ship with Postgres 18 in a few months.

So, what do I know about extensions? I’m a Postgres core developer, but I’ve developed a few extensions in my time; here’s a list of extensions that I’ve built over the years.

[...]

Postgres on Kubernetes for the Reluctant DBA
Posted by Karen Jex in Crunchy Data on 2025-04-01 at 17:29

Slides and transcript from my talk, "Postgres on Kubernetes for the Reluctant DBA", at Data on Kubernetes Day Europe in London on 1 April 2025.



Introduction

This is me!
As you can see from the diagram representing my career so far (and as you already know if you've read my posts or watched my talks before), I have a database background.
I was a DBA for 20 years before I moved into database consultancy, and I’m now a senior solutions architect at Crunchy Data, working with customers to design, implement and manage their database environments, almost exclusively on Kubernetes.




Over the past few years, I’ve given a lot of talks about running Postgres on Kubernetes, and I work with a lot of customers who are at various points on their databases-on-Kubernetes journey.
The questions from the audience and hallway conversations at conferences are always interesting, and tend to come from one of two groups of people:

  1. People who are deep into running databases on Kubernetes and are looking for answers to some tricky technical issue or architectural question.
  2. Self-proclaimed “old-school DBAs” who still aren’t convinced that running databases in containers is a good idea.

I prepared this talk especially for that 2nd group of people, so I hope some of you are in the audience today!
And don’t forget, as you saw on the previous slide, I come from an old-school DBA background, so I’ve gone through that process.




To get an idea of who was in the audience, I asked the question:
What’s your main area of responsibility in your current role? There was a reasonably even split between:

  • Databases
  • System or Platform Administration
  • Devops or Automation
  • Development



The plan for the session was:

  1. Some Databases on Kubernetes background.
  2. Some audience participation.*
  3. A look at some of the main concerns that DBAs have about running DBs on Kubernetes.
  4. Some of the challenges you might encounter, and how you can overc
[...]

Swiss Database Synergy Day 2025: Oracle and PostgreSQL User Groups Unite
Posted by Laurenz Albe in Cybertec on 2025-04-01 at 05:00

Since I sang the praise of the 2024 Swiss PGDay, my people asked me to report from yet another small conference: the Swiss Database Synergy Day 2025, organized by the Swiss Oracle and PostgreSQL user groups.

A joint event with Oracle and PostgreSQL? You must be kidding!

It may indeed be surprising to hear of such an event. But then, PostgreSQL people have held talks at the yearly conference of the German Oracle User Group. The Austrian Oracle User Group held an event dedicated to the interplay of Oracle and PostgreSQL. Furthermore, previous editions of the Austrian PGDay were co-sponsored by an Oracle support company. And this list only covers events in my part of the world.

In addition, there is this nice reminder that the Polish Oracle User Group has sent my boss after he had talked there a while ago:

Polish Oracle Group picture

Still, there is a difference between PostgreSQL people talking at an Oracle conference and a joint event. I can understand why PostgreSQL people are interested in Oracle conferences: after all, a lot of our business comes from Oracle users who have fallen out of love with their database. But what motivates the Oracle users? Are they just curious and open-minded? Do they think the future belongs to PostgreSQL? Or is it just their way of showing dissatisfaction with Oracle?

Then there is the ambivalent word “synergy” in “Database Synergy Day”. Fundamentally, “synergy” is something positive, but in today's business speak it usually refers to merging two companies or departments. Rather than increasing productivity, that means the two entities spend a year fighting for dominance until one of them ends up on top. All the while, productivity is near zero. I sincerely hope that I am not about to witness that kind of synergy...

Well, I'm going to find out.

The speakers' dinner

It is customary to invite the speakers to a dinner as a “thank you” for their (unpaid) effort. This conference is no exception. As I said in my article on the Swiss PGDay 2024, a PostgreSQL feels a

[...]

Akshat Jaimini
Posted by Andreas 'ads' Scherbaum on 2025-03-31 at 14:00
PostgreSQL Person of the Week Interview with Akshat Jaimini: Hi! I am Akshat Jaimini, a final year undergraduate currently based out of Dehradun, India. I am working as an SDE Intern at a fintech firm called ION Group and will graduate soon in June, 2025.

How to fix Hibernate bug by conditional index
Posted by Pavel Stehule on 2025-03-30 at 05:59

Yesterday I noticed a significant growth in sequentially read tuples. After some investigation I found a query with a strange predicate:

WHERE 1 = case when pb1_0.parent_id is not null then 0 end

It looked really strange, and I had to ask who wrote it.

The reply: Hibernate. It is Hibernate's transformation of the predicate parent_id = (?) when the list of ids is empty.

Unfortunately, PostgreSQL is not able to detect that this predicate can never be true, and so the repeated executions ended in repeated full scans.
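To see why the predicate can never be true: the CASE yields 0 when parent_id is not null, and NULL otherwise, so the comparison with 1 is always false or NULL. A quick runnable illustration (using SQLite via Python's standard library as a stand-in; the three-valued logic here matches Postgres):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# When parent_id is NOT NULL, the CASE yields 0, and 1 = 0 is false:
not_null = con.execute(
    "SELECT 1 = CASE WHEN 'x' IS NOT NULL THEN 0 END"
).fetchone()[0]

# When parent_id IS NULL, no CASE branch matches, so the CASE yields NULL,
# and 1 = NULL evaluates to NULL -- which also fails a WHERE filter:
null_case = con.execute(
    "SELECT 1 = CASE WHEN NULL IS NOT NULL THEN 0 END"
).fetchone()[0]

print(not_null, null_case)  # 0 None
```

Since neither branch can ever produce 1, no row can ever satisfy the predicate.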

Fortunately, Postgres has a simple workaround - a conditional (partial) index:

CREATE INDEX ON pb (id)
  WHERE 1 = case when parent_id is not null then 0 end;

This index is always empty, so an index scan over it is fast.

This issue is fixed in more recent versions of Hibernate, where the predicate 1=0 is generated instead.

Transparent Data Encryption for PostgreSQL Release Candidate is Here!
Posted by Jan Wieremjewicz in Percona on 2025-03-28 at 13:23
PostgreSQL is, without a doubt, one of the most popular open source databases in the world. Why? Well, there are many reasons, but if I had to pick just one, I’d say it’s extensibility. PostgreSQL isn’t just a database; it’s an ecosystem of extensions that can transform it to tackle any upcoming challenges. By enabling […]

Converting JSON documents to relational tables
Posted by Regina Obe in PostGIS on 2025-03-28 at 04:03

JSON is one of the most popular ways of disseminating data between systems. It is probably the most common format offered by webservices. PostgreSQL is a database perfectly suited for grabbing that data and transforming it into a more structured relational format. All this can be done directly in the database. We'll go over some ways to load and restructure JSON data.
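As a flavor of the transformation involved — turning a JSON array of objects into rows — here is a minimal language-agnostic sketch in plain Python (illustrative only; the article does this inside PostgreSQL with its JSON functions):

```python
import json

# A JSON document as a webservice might return it:
doc = '[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]'

# Flatten the array of objects into relational-style rows:
rows = [(obj["id"], obj["name"]) for obj in json.loads(doc)]
print(rows)  # [(1, 'Alice'), (2, 'Bob')]
```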

Continue reading "Converting JSON documents to relational tables"

Saving The Output From psql
Posted by Dave Stokes on 2025-03-27 at 20:39
 
Occasionally, you will need to capture data when working with a database. You can cut-n-paste or use a shell command like script. In typical PostgreSQL fashion, psql has two ways to save your output.

The First Way

You must invoke psql with the --log-file= or -L option. The entire session will be recorded.

stoker@ThinkPad:~$ psql --log-file=/tmp/psqllog demo
Password for user stoker:
psql (17.4 (Ubuntu 17.4-1.pgdg24.04+2))
Type "help" for help.
demo=# \d ledger
                   Table "public.ledger"
     Column     |  Type   | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
 id             | integer |           |          |
 tx_id          | integer |           |          |
 total_register | money   |           |          |
 misc           | integer |           |          |
 status         | text    |           |          |
demo=# \q
stoker@ThinkPad:~$ cat /tmp/psqllog
                   Table "public.ledger"
     Column     |  Type   | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
 id             | integer |           |          |
 tx_id          | integer |           |          |
 total_register | money   |           |          |

The Second Way

The other choice is to use -o or --output= when you start psql. The differences are that the SQL command is not recorded, and the output does not show on your screen.

stoker@ThinkPad:~$ psql -o /tmp/out demo
Password for user stoker:
psql (17.4 (Ubuntu 17.4-1.pgdg24.04+2))
Type "help" for help.
demo=# \dt         # note the lack of output compared to the above
demo=# \q
stoker@ThinkPad:~$ cat /tmp/out
          List of relations
 Schema |   Name    | Type  | Owner
--------+-----------+-------+--------
 public | customers | table | stoker
 public | ex01      | table | stoker
 public | ex02      | table | stoker
 public | ex03      | table | stoker
[...]

Contributions for the week of 2025-03-10 (Week 10)
Posted by Pavlo Golub in postgres-contrib.org on 2025-03-27 at 12:51

On March 12, 2025 Elizabeth Christensen organized Postgres Meetup for All and the event theme was Postgres Schema Migrations. The following people contributed with their respective talks:


PGConf India 2025 took place on March 05 - 07, 2025 in Bangalore, India. The Organizing Team of PGConf India 2025 consists of the following people:

  • Pavan Deolasee
  • Ashish Mehra
  • Nikhil Sontakke
  • Aditya Raje
  • Hari Kiran

The Program Committee entrusted with selecting talks and presentations for PGConf India, 2025 consists of the following people:


PostgreSQL Person of the week: Doug Ortiz. Interview conducted by: Andreas Scherbaum.

March Meetup with Ryan Booz
Posted by Henrietta Dombrovskaya on 2025-03-27 at 11:45

On March 25, Chicago PUG members gathered one more time at the Training Center of the DRW Holdings office for our monthly meetup. Our speaker was Ryan Booz, an Advocate at Redgate; he presented the talk Intro to database migrations with Flyway.

Fun fact: Ryan was one of the first presenters at Chicago PUG at our new location at DRW in January 2023. It was only the second meetup at that location, and Ryan braved the Chicago winter to come and present here, which shows a real dedication from somebody who is not local to the Midwest 🙂

At that time, our Training Center was undergoing reconstruction, so we met in one of the conference rooms. Interestingly, Ryan’s talk in January 2023 was the earlier version of the same talk, and the topic is still more than relevant.

I would say that the March meetup was perfect in every aspect: the speaker, the presentation, the audience participation, the makeup of the audience, and the discussions we had after the presentation. My only oversight was that we ran out of pizza! That has happened to me only two or three times in all the eight years I have been running this meetup. In my defence, that meetup was unusual in a positive way: the majority of those who RSVPed actually showed up :).

Many thanks to Ryan and all Chicago PUG members! You are the best!

Semantic Search in PostgreSQL: An Overview
Posted by Hans-Juergen Schoenig in Cybertec on 2025-03-27 at 08:36

PostgreSQL offers advanced capabilities through extensions like pgvector, which enable semantic search at a level and quality never achieved before. Unlike traditional text search, which mostly relies on trivial string comparison, semantic search in PostgreSQL goes beyond keywords by understanding context and meaning, enhancing relevance.

With extensions such as pgvector and pgai, it is possible to combine the power of a relational database with cutting-edge features of modern AI models.

Benefits of Semantic Search in PostgreSQL

The benefits of semantic search are plentiful:

  • Improved search accuracy and user experience
  • Better handling of ambiguity
  • Content recommendations
  • Personalized content

pgvector: Create a table for text data

After our brief introduction, it is time to take a look at the technical aspects and see how this can be done for real. For the purpose of this demo we have used the excellent data set available here. It contains 3.6 million Amazon reviews in an easy-to-use CSV format.

The first thing we have to do is to load the extension into our database and create a table that we can use to store the data we want to query:

cybertec=# CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION

cybertec=# CREATE TABLE t_document 
(
        id              serial          PRIMARY KEY, 
        polarity        float, 
        title           text, 
        body            text, 
        embedding       vector(384) 
);
CREATE TABLE

There are two important things to note here. The first observation is that the extension has been loaded into the database. Note that the name we have to use here is "vector", and not "pgvector" as one might expect. The second noteworthy thing is the fact that a new data type called "vector" is used here. However, the extension has more data types:

  • vector: up to 2,000 dimensions
  • halfvec: up to 4,000 dimensions
  • sparsevec: up to 1,000 non-zero elements

However, for our use case we need

[...]
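For intuition about what these vector columns enable: pgvector ranks rows by distance measures such as cosine distance (its <=> operator returns 1 minus the cosine similarity). A tiny pure-Python sketch of cosine similarity, illustrative only; pgvector implements this natively inside the database:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity = dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Two embeddings pointing in similar directions score close to 1, which is what makes "nearest by meaning" queries possible.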

pgstream v0.4.0: Postgres-to-Postgres replication, snapshots & transformations
Posted by Esther Minano in Xata on 2025-03-27 at 00:00
Learn how the latest features in pgstream refine Postgres replication with near real-time data capture, consistent snapshots, and column-level transformations.

TIL - Starting in read-only mode the easy way
Posted by Kaarel Moppel on 2025-03-26 at 22:00
Once a year or so Postgres manages to surprise me during one of those daily DB micro-operations that usually come from muscle memory, without any braincells spent. Today, to my delight, I stumbled on one such surprise again - for the task of starting a psql session in a safe read-only...

Creating Postgres roles with passwords stored in gopass
Posted by Thomas Klausner on 2025-03-26 at 15:00

We are currently setting up a staging server (called beta) for a project consisting of two sub-projects and a bunch of services in each project. Each service uses a distinct role (Postgres-speak for user), so I needed to set up a lot of roles, where each role has a distinct username and password. We use gopass to manage our passwords (and related info like usernames). When deploying (via Ansible) we can extract the passwords from gopass and inject them into the environment of the containers running the services. Instead of manually creating a password, storing it in gopass and creating a role in our Postgres cluster, I wrote a short script to automate this:

The script

#!/usr/bin/perl

use v5.36;

open(my $fh, ">", "create_roles.sql") || die $!;

my @stages = qw(beta production);
my @services = qw(accounts-admin accounts-mailer accounts-sso winxle-admin winxle-ga-logger winxle-merchant winxle-newsletter winxle-quickdatacollectionform winxle-scripts winxle-user winxle-winner);

for my $stage (@stages) {
    for my $service (@services) {
        $service =~ /^(\w+)-/;
        my $project = $1;
        my $gopasspath = join('/','validad', $project, $stage, $service, 'database');
        my $snake_service = $service;
        $snake_service=~s/-/_/g;

        my $rv = `gopass generate -f -p $gopasspath 32`;
        my @lines = split(/\n/, $rv);
        my $pw = $lines[3];

        my $role = sprintf('connect_%s_%s', $stage, $snake_service);

        my $create_role = sprintf("CREATE ROLE %s IN ROLE %s LOGIN PASSWORD '%s';", $role, $project, $pw);
        say $fh $create_role;

        my $add_username_cmd = qq{echo "\nuser: $role" | gopass insert -a $gopasspath};
        system($add_username_cmd);
    }
}
close $fh;

Walk through

I open a file to store the SQL statements (create_roles.sql), define my stages and my services, and walk through all of them in a double for loop.

I extract the $project name from the $service name (i.e. accounts or winxle) and use that to create the path to the secret in

[...]

15 years of Prague PostgreSQL Developer Day
Posted by Tomas Vondra on 2025-03-26 at 10:00

It’s been a couple weeks since P2D2 (Prague PostgreSQL Developer Day) 2025. We’ve been busy with the various tiny tasks that need to happen after the conference - processing feedback, paying invoices, and so on. But it’s also a good opportunity to look back - I realized this was the 15th year of the event I’ve helped to organize, so let me share some of that experience.

Prague PostgreSQL Developer Day 2024 / talks

Postgres Security Checklist from the Center for Internet Security
Posted by Elizabeth Garrett Christensen in Crunchy Data on 2025-03-25 at 15:00

The Center for Internet Security (CIS) releases security benchmarks to cover a wide variety of infrastructure used in modern applications, including databases, operating systems, cloud services, containerized services, and even networking. Since 2016 Crunchy Data has collaborated with CIS to provide this security resource for those deploying Postgres. The output of this collaboration is a checklist for folks to follow and improve the security posture of Postgres deployments.

The PostgreSQL CIS Benchmark™ for PostgreSQL 17 was just recently released.

The Center for Internet Security

The Center for Internet Security (CIS) is a nonprofit organization that collaborates with government and commercial entities to develop best practices for securing IT systems and data. CIS Benchmarks are community driven and help provide configuration recommendations in the form of security checklists. CIS allows public contributions, reviews, and an open discussion forum on the benchmarks to make sure they meet broader community standards.

The CIS Benchmark for Postgres is a free, community supported, security checklist for Postgres.

Getting started with the Postgres benchmark

The CIS Benchmark for Postgres is a freely available pdf for non-commercial use with recommendations alongside Postgres configurations. The pdf is 200+ pages of descriptions, rationale, and sample code to verify Postgres configurations.

In addition to manual verification, teams standardizing on this benchmark incorporate these settings into their infrastructure deployment tools. Using infrastructure-as-code tools with the benchmarks ensures that deployments across an organization meet these security specifications.

For commercial use of CIS Benchmarks, CIS has membership and tools to automatically run the benchmarks.

What is in the CIS Postgres benchmark security checklist?

The benchmark covers a variety of topics for Postgres deployment and configurations, including:

  • Postgres install and file permiss
[...]

Mini Summit 2: Extension Search Path Patch
Posted by David Wheeler in Tembo on 2025-03-24 at 21:14
Orange card with large black text reading “Implementing an Extension Search Patch”. Smaller text below reads “Peter Eisentraut, EDB” and “03.26.2025”. A photo of Peter speaking into a mic at a conference appears on the right.

This Wednesday, March 26 at noon America/New_York (16:00 UTC), Peter Eisentraut has graciously agreed to give a talk at the Extension Mini Summit #2 on the extension search path patch he recently committed to PostgreSQL. I’m personally stoked for this topic, as freeing extensions from the legacy of a single directory opens up a number of new patterns for packaging, installation, and testing extensions. Hit the Meetup to register for this live video conference, and to brainstorm novel uses for this new feature, expected to debut in PostgreSQL 18.

2025 Postgres Extensions Mini Summit One
Posted by David Wheeler in Tembo on 2025-03-24 at 20:46

Back on March 12, we hosted the first in a series of PostgreSQL Extensions Mini Summits leading up to the Extension Ecosystem Summit at PGConf.dev on May 13. I once again inaugurated the series with a short talk on the State of the Extension Ecosystem. The talk was followed by 15 minutes or so of discussion. Here are the relevant links:

And now, with many thanks to Floor Drees for the effort, the transcript from the session.

Introduction

Floor Drees introduced the organizers:

David presented a State of the Extension Ecosystem at this first event, and shared some updates from PGXN land.

The stream and the closed captions available for the recording are supported by PGConf.dev and their gold level sponsors, Google, AWS, Huawei, Microsoft, and EDB.

State of the Extensions Ecosystem

So I wanted to give a brief update on the state of the Postgres extension ecosystem, the past, present, and future. Let’s give a brief history; it’s quite long, actually.

There were originally two approaches back in the day. You could use shared preload libraries to have Postgres preload dynamic shared libraries into the main process. And then you could do pure SQL stuff, including procedural languages like PL/Perl, PL/Tcl, and such.

And there were a few intrepid early adopters, including PostGIS, BioPostgres, PL/R, PL/Proxy, and pgTAP, who all made it work. Beginning with Postgres 9.1, Dimitri Fontaine added explicit support for extensions to the Postgres core itself.

[...]

Shruthi K C
Posted by Andreas 'ads' Scherbaum on 2025-03-24 at 14:00
PostgreSQL Person of the Week Interview with Shruthi K C: My name is Shruthi K C, and I live in Bangalore, Karnataka, India. I work as a Database Developer at EnterpriseDB, with over 15 years of experience in databases. The last 4 years I have been primarily focused on PostgreSQL.

Why PostgreSQL needs a better API for alternative table engines?
Posted by Alexander Korotkov on 2025-03-24 at 00:00

For a long time now, PostgreSQL has had an extensible Index Access Method API (called AM), which has stood the test of time and enabled numerous robust extensions to provide their own index types. For example: rum, pgvector, bloom, zombodb and others. PostgreSQL 12 introduced the Table AM API, promising equivalent flexibility for table access methods.

Despite PostgreSQL's Table AM API being available since version 12 and ongoing criticisms of its built-in storage engine — particularly the MVCC model ([1], [2], [3]) — it remains surprising that no fully featured transactional storage engine has yet emerged purely as an extension.

Since the table AM and index AM APIs are tightly coupled, this is an issue for both implementations.

The features most in demand for alternative PostgreSQL table engines are:

  1. Alternative MVCC implementations, e.g., UNDO-log-based storages. The motivation to provide this is well discussed in the Uber blog post, Andy Pavlo's blog post, and many other sources.
  2. Non-heap-like storages. For example, in index-organized tables, the index is not an optional addition to the table that speeds up requests, but a necessary layer the table storage uses internally. Consequently, the table tuple is part of a complex data structure and can't be addressed by a fixed-length address like a page number and offset number; it must be addressed by a variable-length identifier such as an index key.

The API extension to provide #2 is more or less understandable. This could be done by replacing ctid with an arbitrary sequence of bytes in all the APIs. However, #1 seems rather complex and requires much clarification.

An example that illustrates the motivation for changes to the table and index AM APIs is OrioleDB, an extension providing a table access method we developed to address many well-known shortcomings of the built-in storage engine. However, OrioleDB is not yet a drop-in extension; it requires several patches to the PostgreSQL Core.

Besides these two things, wh

[...]

PgPedia Week, 2025-03-23
Posted by Ian Barwick on 2025-03-23 at 21:38

With CommitFest 52 (2025-03) - the final CommitFest in the PostgreSQL 18 development cycle - entering its final stages, there has been a burst of activity, and the PostgreSQL 18 code freeze has been announced for April 8th.

PostgreSQL 18 changes this week:

  • EXPLAIN: hooks for use by extensions added
  • extension_control_path: new GUC enabling extension control files to be stored in a custom location
  • io_combine_limit: maximum range extended to 1MB / 128 blocks
  • io_max_combine_limit: new GUC controlling the largest I/O size in operations that combine I/O
  • maintenance_io_concurrency: default value increased to 16
  • max_active_replication_origins: new GUC controlling the maximum number of replication origins
  • query_id_squash_values: new GUC to control squashing of lists of constants in query jumbling
  • vacuum_truncate: new GUC controlling truncation of pages at the end of a table
  • pg_combinebackup: option -k / --link added
  • pg_createsubscriber: option -R / --remove added
  • psql: \sendpipeline slash command added
  • vacuumdb: --missing-stats-only option added for analyzing only relations missing stats

more...

SCaLE 22x, live streams and Row-Level Security Sucks
Posted by Jimmy Angelakos on 2025-03-21 at 19:30

Obligatory hallway arch photo from SCaLE22x

The 22nd edition of the Southern California Linux Expo, or SCaLE 22x, took place from March 6-9, 2025, in Pasadena, California. It was a fantastic event packed with insightful talks, community spirit, and legendary figures from the computing world.

One of the highlights of SCaLE 22x was getting the chance to attend talks by two icons: Leslie Lamport and Jon "maddog" Hall. Leslie Lamport, whom many know for Paxos, TLA+, and generally being the "father of distributed computing", delivered the closing keynote titled "Coding isn't Programming". Meant to shake up developers, it offered thought-provoking and engaging arguments. It was a rare privilege to hear directly from someone whose work has influenced so much of modern computing.

Earlier, Jon "maddog" Hall's talk "The JOY of Free Software: It should be fun to fish" was a great reminder of why we do what we do in the open-source world, and that we should not lose track of how much fun this journey of creation is. maddog's passion for free software and its ethical implications is always inspiring, and I've mentioned before on this blog how his example has guided my career choices.

PostgreSQL @ SCaLE 22x

Postgres again had a strong presence at SCaLE with a dedicated track of talks (thanks PgUS for the booth in the Expo hall), and I had the pleasure of being part of several related sessions:

  • On Thursday, March 6th, I participated in the PostgreSQL Ask Me Anything session alongside some incredible PostgreSQL experts. The discussion touched on internals, performance, best practices, and the future of Postgres. It's always great to interact with the community and share insights from our experiences.

  • On Friday, March 7th, I delivered my talk titled "Row-Level Security Sucks. Can We Make It Usable?". It was a deep dive into PostgreSQL's Row-Level Security (RLS), highlighting its strengths and shortcomings, and exploring ways to make it more accessible to empower developers in real-world applications.

[...]

Mentoring Applications and Hacking Workshop for April 2025
Posted by Robert Haas in EDB on 2025-03-19 at 15:04
Here are a few mentoring-related updates. Read more »

Postgres Troubleshooting: Fixing Duplicate Primary Key Rows
Posted by Greg Sabino Mullane in Crunchy Data on 2025-03-19 at 14:00

Someone recently asked on the Postgres mailing lists about how to remove unwanted duplicate rows from their table. They are “unwanted” in the sense that the same value appears more than once in a column designated as a primary key. We’ve seen an uptick in this problem since glibc was kind enough to change the way they sort things. This can lead to invalid indexes when one upgrades their OS and modifies the underlying glibc library.

One of the main effects of a corrupted unique index is allowing rows to get added which should be caught by the primary key. In other words, for a table with a primary key on a column named “id”, you might observe things like this:

-- Can you spot the problem?
SELECT id FROM mytable ORDER BY id LIMIT 5;

 id
----
  1
  2
  3
  3
  4
(5 rows)

Without knowing anything else about the problem, what’s the first step to solving it? If you said take a backup, you are correct! Make a fresh backup anytime you think something is wrong with your database, or before any attempt to fix such a problem.

Aside: can we simply reindex? No - a unique index cannot be created (or recreated) as long as there are duplicate rows. Postgres will simply refuse to do so, as it violates the uniqueness we are trying to enforce. So we must delete the rows from the table.
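Before deleting anything, it helps to see exactly which ids are affected. A generic way to list them is the standard GROUP BY ... HAVING approach (a sketch, not necessarily the article's exact recipe; shown against SQLite via Python's stdlib so it runs self-contained):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Mimic a table whose unique index has silently stopped enforcing uniqueness:
con.execute("CREATE TABLE mytable (id INTEGER)")
con.executemany("INSERT INTO mytable VALUES (?)", [(1,), (2,), (3,), (3,), (4,)])

# List each duplicated id and how many copies of it exist:
dups = con.execute(
    "SELECT id, COUNT(*) AS copies FROM mytable GROUP BY id HAVING COUNT(*) > 1"
).fetchall()
print(dups)  # [(3, 2)]
```

The same query shape works in Postgres, with the caveat the article raises: with a corrupted index you may want to discourage index scans first so the query actually reads the table.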

Here is one careful recipe for fixing the problem of duplicated primary key entries in your Postgres table. Obviously you need to adjust this as needed for your situation, but just walk through it slowly and make sure you understand each step. Especially if this happens in production (spoiler: it almost always happens in production).

1. Debugging aids

The first thing we want to do may look a little strange:

-- Encourage not using indexes:
set enable_indexscan = 0;
set enable_bitmapscan = 0;
set enable_indexonlyscan = 0;

Because bad indexes are the primary way bad rows like this get into our database, we cannot trust them. These low-level debug aids are a way of telling the Postgres planner to prioritize other ways

[...]
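Picking up from the truncated excerpt: with index scans discouraged, a sequential scan reports what is actually in the heap. A common way to locate the duplicates (a sketch using the article's mytable/id example, not necessarily its exact next step) is:

```sql
-- Full-scan the table and list every primary-key value
-- that appears more than once:
SELECT id, count(*) AS copies
FROM mytable
GROUP BY id
HAVING count(*) > 1
ORDER BY id;

-- The system column ctid identifies each physical row, which is
-- useful later for deleting one duplicate while keeping the other:
SELECT ctid, id
FROM mytable
WHERE id IN (SELECT id FROM mytable GROUP BY id HAVING count(*) > 1);
```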

PostgreSQL's COPY and \COPY
Posted by Dave Stokes on 2025-03-19 at 13:59
PostgreSQL is the Swiss Army knife of the database world. There are things that are very simple to use in PostgreSQL but take many more steps to accomplish in other databases. Sometimes, though, the knife has too many blades, which can cause confusion. This is one of those cases.

COPY and \COPY

I needed a quick dump of a table as part of an ETL (Extract, Transform, and Load - official definition; Exorcism, Trauma, and Lost-sleep - unofficially) to port some data. The COPY command is a quick way to output a table's data. The two copy commands below illustrate adding a delimiter, writing first to stdout and then to a file.

demo=# select * from z;
 a | b  |  c  
---+----+-----
 1 | 10 | 100
 2 | 20 | 200
 3 | 30 | 300
 4 | 40 | 400
(4 rows)
demo=# copy z to stdout (DELIMITER ',');
1,10,100
2,20,200
3,30,300
4,40,400
demo=# copy z to '/tmp/z' (DELIMITER ',');
COPY 4
demo=# 

The trick is that you must REMEMBER that the PostgreSQL server itself is writing the file, so it needs permission to write in the desired directory. If you try to write to your current directory with a relative path, you will see this:

demo=# copy z to 'z' (DELIMITER ',');
ERROR:  relative path not allowed for COPY to file
demo=# 

ETL

The table I needed to 'fix' as part of the ETL process was edited and was now ready to load. I created a new table.

demo=# create table zz (a1 int, b1 int, c1 int);
CREATE TABLE

The tricky part is that you cannot use COPY to load data from a client-side file. You need to use psql's \COPY! Consider \COPY a client-side wrapper around the server's COPY command.

demo=# \copy zz from '/tmp/z' (delimiter ',')
COPY 4
demo=# 

I find the COPY - \COPY pairing very handy.
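Because \copy runs inside psql on the client, the same mechanism works for exports too: the file lands on the client side, so relative paths are allowed (a sketch, continuing with the z table from the session above):

```sql
-- Client-side export; z.csv is written to psql's current directory:
\copy z to 'z.csv' (DELIMITER ',')
```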

More?


I highly recommend Wagner Bianchi's COPY vs. \copy in PostgreSQL and their Security Implications for DBAs for those seeking more information on this subject and its healthy
[...]

How to Use regexp_matches and regexp_match in PostgreSQL
Posted by Colin Copeland on 2025-03-19 at 08:00

Introduction

regexp_matches() and regexp_match() are two similar string functions that support regular expression matching directly in the PostgreSQL database. regexp_matches() was added in PostgreSQL 8.3, and regexp_match() was added in PostgreSQL 10 (keep reading to see how ChatGPT struggled to answer this question).

As noted in the PostgreSQL 10 release notes, regexp_match():

"... only returns results from the first match so it does not need to return a set, making it easier to use for simple cases."

The differences between these two functions are nuanced, however, and they are easiest to understand with some examples.

Test data

You can create a short table with some test data in a local PostgreSQL database like so:

$ psql
psql (16.1)
Type "help" for help.
tobias=# CREATE TABLE patterns (value text);
CREATE TABLE
tobias=# INSERT INTO patterns VALUES ('foo'), ('bar'), ('foobar'), ('foo1barfoo2bar');
INSERT 0 4
tobias=# SELECT * FROM patterns;
     value      
----------------
 foo
 bar
 foobar
 foo1barfoo2bar
(4 rows)

regexp_matches()

Since regexp_matches() was added first to PostgreSQL, let's see how it works and perhaps why its companion regexp_match() was added later on.

tobias=# SELECT
tobias-#     value
tobias-#     , regexp_matches(value, 'foo\d?', 'g')
tobias-#     , regexp_matches(value, 'foo\d?', 'g') is null AS is_null
tobias-# FROM patterns;
     value      | regexp_matches | is_null
----------------+----------------+---------
 foo            | {foo}          | f
 foobar         | {foo}          | f
 foo1barfoo2bar | {foo1}         | f
 foo1barfoo2bar | {foo2}         | f
(4 rows)

Observations:

  1. The example uses the g flag, since according to the documentation, "if you only want the first match, it's easier and more efficient to use regexp_match()."
  2. One row each is returned for the value
[...]
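For comparison, regexp_match() (no g flag, at most one match per row) behaves like this on the same test data; the output below follows from the documented semantics, with NULL shown as blank:

```sql
SELECT
    value
    , regexp_match(value, 'foo\d?')
    , regexp_match(value, 'foo\d?') IS NULL AS is_null
FROM patterns;
```

     value      | regexp_match | is_null
----------------+--------------+---------
 foo            | {foo}        | f
 bar            |              | t
 foobar         | {foo}        | f
 foo1barfoo2bar | {foo1}       | f
(4 rows)

Note that every row comes back exactly once, unlike regexp_matches() with the g flag, which drops non-matching rows and repeats rows that match more than once.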

PGDay UK 2025 - Call for Papers and Call for Sponsors Open!
Posted by Chris Ellis on 2025-03-18 at 14:30

We're happy to announce that the Call for Papers and Call for Sponsors is open for PGDay UK 2025!

Call for Papers is open from today until May 12, 2025 at 23:59:59 in London, UK. Selected speakers will be notified before May 26, 2025. We accept proposals from both established and new speakers on a range of PostgreSQL related topics. You don't need to have the talk written beforehand, we just need an abstract of your talk, so don't hold back and tell your friends. Find out more: https://2025.pgday.uk/call-for-papers/ .

Call for Sponsors is open from today until 2025-09-02. Our Benefactor level is limited to 8 sponsors and will sell out, so best grab your company's spot sooner rather than later. Find out more here: https://2025.pgday.uk/become-sponsor/ .

Citus: The Misunderstood Postgres Extension
Posted by Craig Kerstiens in Crunchy Data on 2025-03-18 at 13:50

Citus is in a small class of the most advanced Postgres extensions in existence. While there are many Postgres extensions out there, few hook as deeply into Postgres or change storage and query behavior as dramatically. Most people who come to Citus arrive with very wrong assumptions. Citus turns Postgres into a sharded, distributed, horizontally scalable database (that's a mouthful), but it does so for very specific purposes.

Citus, in general, is a fit for these types of applications, and only these:

  • Sharding a multitenant application: a SaaS/B2B style app, where data is never joined between customers
  • Low-latency, user-facing, high-data-volume analytics: specifically where the dashboards are hand-curated with minimal levers and knobs for the user to change (i.e. customers cannot generate arbitrary queries)

Use cases mistakenly brought to Citus that are not a great fit:

  • Workloads without rigid control over the queries sent to the database
  • Geographic residency goals or requirements; Citus is distributed for scale, not distributed for edge.

Let's look closer at each of the two use cases that Citus is a good fit for.

Multitenant/SaaS applications

Multitenant or SaaS applications typically follow a pattern: 1) tenant data is siloed and does not intermingle with any other tenant's data, and 2) a "tenant" is a larger entity like a "team" or "organization".

An example of this could be Salesforce. Within Salesforce you have the notion of an organization, and the organization has accounts, customers, and opportunities within them. When you create a Salesforce account, all of your customers and opportunities are solely yours — data is not shared with other Salesforce organizations.
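In Citus terms, the tenant column becomes the distribution key, so all of a tenant's rows land in the same shard (a hypothetical sketch; the table and column names are illustrative, not from the post):

```sql
-- Hypothetical multitenant schema; org_id is the tenant.
CREATE TABLE accounts (
    org_id  bigint NOT NULL,
    id      bigint NOT NULL,
    name    text,
    PRIMARY KEY (org_id, id)
);

-- Citus API call: shard the table by tenant, co-locating
-- each organization's rows within a single shard.
SELECT create_distributed_table('accounts', 'org_id');
```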

For these types of applications, Citus distributes the data for each tenant into a shard. Citus handles the splitting of data by creating placement groups that know they are grouped together, and placing the data within shards on specific nodes. A physical node may contain multiple shards. Let me restate that to under

[...]

Debugging PostgreSQL more easily
Posted by Hans-Juergen Schoenig in Cybertec on 2025-03-18 at 06:00

PostgreSQL is the foundation for thousands and thousands of applications. The system has long since proven its worth and operates reliably. However, a question people often ask is: what is the best way to actually debug a database application?

For a couple of years now, I have been using a simple method that greatly speeds up the debugging process of typical applications.

Using inheritance to store data

One of the core features of PostgreSQL, which has been around for decades already, is the idea of "inheritance". So what does that mean, exactly? Simply speaking, tables can inherit columns from each other. A child table will simply have all the columns of the parent table, plus its own additional ones.

But what does this mean in real life, and what does it have to do with debugging? Let us take a closer look:

CREATE TABLE t_global 
(
        id              serial, 
        tstamp          timestamptz     DEFAULT now() 
);
CREATE TABLE t_product 
(
        name            text, 
        price           numeric
) INHERITS (t_global);
INSERT INTO t_product (name, price) 
        VALUES  ('Shoes', 113.98), 
                ('Sausage', 4.58);

The t_global table contains a sequence-backed id column and a timestamp. Those two columns are passed down to t_product, which means the sequence now also applies to the child table.

Here is a second table:

CREATE TABLE t_country 
(       
        country_name    text
) INHERITS (t_global);
INSERT INTO t_country (country_name) 
        VALUES  ('Austria'), 
                ('Germany'), 
                ('Japan');

We are now using the same parent table for the country table. Why does it matter? We have just produced two things: first, there is ONE global sequence for all tables, which means that all ids across the entire system are unique (we will need this later). Second, all tables have a timestamp with the identical default value.
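Because both children inherit from t_global, querying the parent scans every child table as well, giving one place to see all ids and timestamps (a sketch; with the inserts above, ids 1-2 come from t_product and 3-5 from t_country, all drawn from the single shared sequence):

```sql
-- Selecting from the parent also returns rows from all children:
SELECT id, tstamp FROM t_global ORDER BY id;
```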

Using the parent table for debugging applications

Let us now imagine for

[...]

