Data checksums are one of those Postgres features that, when they are doing their job, are easily forgotten. They sit quietly in the header of every data page as a small integer fingerprint, forever waiting to thwart the threat of cosmic rays or errant hardware failures. Most clusters run from cradle to grave and never trip a single one.
For years, that decision was etched in stone at the time of database initialization. It wasn't until version 12 that Postgres introduced the pg_checksums utility to change it. And even then, doing so is a fully offline affair, grinding through every page on disk and incurring a long outage window.
That's a fairly painful ordeal for a basic safeguard that wasn't even enabled by default until version 18. So why go through all the trouble in the first place? Do we really need data checksums in our Postgres cluster? The short answer is "yes". The longer answer explains why Postgres 19 continues to improve the checksum system by adding an online conversion capability.
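To make the fingerprint idea concrete, here is a minimal sketch of how a per-page checksum catches silent corruption. It is not Postgres's actual algorithm or page layout: `zlib.crc32` stands in for the real checksum, and the 4-byte checksum at the front of the page, along with the helper names, are assumptions made purely for illustration.

```python
import zlib

PAGE_SIZE = 8192      # Postgres' default block size
CHECKSUM_BYTES = 4    # assumption: this sketch keeps a CRC32 in the first 4 bytes


def write_page(payload: bytes) -> bytes:
    """Return a page whose first bytes hold a checksum of the rest."""
    body = payload.ljust(PAGE_SIZE - CHECKSUM_BYTES, b"\x00")
    if len(body) != PAGE_SIZE - CHECKSUM_BYTES:
        raise ValueError("payload too large for one page")
    checksum = zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "big")
    return checksum + body


def verify_page(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return zlib.crc32(page[CHECKSUM_BYTES:]) == stored


def flip_bit(page: bytes, byte_offset: int, bit: int) -> bytes:
    """Simulate a single flipped bit, as failing hardware might produce."""
    damaged = bytearray(page)
    damaged[byte_offset] ^= 1 << bit
    return bytes(damaged)


page = write_page(b"some tuple data")
corrupted = flip_bit(page, byte_offset=1000, bit=3)
print(verify_page(page), verify_page(corrupted))  # True False
```

Without the stored fingerprint, the flipped bit would simply be read back as if it were valid data, which is the failure mode checksums exist to expose.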
This blog post is going to show you how to set up your own RAG Server on pgEdge Cloud. The Cloud UI makes this so easy it is almost insulting - a few clicks and you are done - so I am going to show you the harder and more interesting path instead: the Cloud API. Everything below is a real call you can adapt. Replace any placeholder values with your own, and keep your API keys out of your shell history.
Click on "Services" under your database, click on "Add RAG Server" and enter your config. See? Sooo easy. Soooo boring. Let's use the API.
First things first: I need a dataset. Because this is my blog, I get to choose the use-case, so I am inflicting my personal interests on you. I am a huge Tabletop RPG nerd (yes, like Dungeons & Dragons), and my favourite system is GURPS 4th Edition (Generic Universal Roleplaying System) by Steve Jackson Games (No, you don’t have to care about this). Although mechanically simple, this is a huge sprawling game system that just grows and grows, because it is generic and universal. You can run a game in any setting or genre, so the amount of content that is available has become massive. I have around 50 books… thankfully in electronic format.
Finding the right rule at the table during play can sometimes be quite an exercise, let alone adding up all the modifiers, and applying the rule. So I did what any reasonable person with a distributed Postgres habit would do. I crammed all fifty books into my own personal RAG server, so I can ask it a plain-English question and get a cited answer back in seconds. Then when I and my degenerate friends gather every Sunday, away from the wives, clutching our beer, dice, pencils, and pizza, I can impose rapid smack-downs upon the Rules Lawyers who try to argue with me.
Anyway… enough about my sample use-case, let's get into it.
If you are of a certain age, the words 38911 BASIC BYTES FREE will do something to you that no amount of therapy can undo. You remember the blue screen. You remember typing in three pages of a listing from a magazine, getting ?SYNTAX ERROR IN 2340, and not knowing which of the three pages contained the typo. You remember that the disk drive was device 8, and that it was slower than continental drift.
I have some news. All of that now runs inside PostgreSQL.
PL/CBMBASIC is a procedural language extension that executes function bodies on Commodore 64 BASIC V2. Not a lookalike, not a tribute act: the actual Microsoft/Commodore interpreter from 1982, by way of Michael Steil's cbmbasic project, which statically recompiled the 6502 ROM into C. That C is compiled straight into the extension's shared library, so the interpreter lives inside your backend process. Every function call is an in-memory power cycle: zero the 64KB RAM array, reset the CPU registers, and re-enter the ROM at $E394. The whole ceremony costs about 15 to 20 microseconds, which is roughly a thousand times faster than the hardware ever managed, and quick enough to call per row over a large table without feeling guilty.
CREATE EXTENSION plcbmbasic;
CREATE FUNCTION hello(who text) RETURNS text AS $$
10 PRINT "HELLO, ";WHO$;"!"
$$ LANGUAGE plcbmbasic;
SELECT hello('WORLD'); -- HELLO, WORLD!
Yes, those are line numbers. Yes, they are mandatory. User code starts at line 10, like nature intended, because lines 0 to 9 are reserved: the extension injects your function arguments there as ordinary BASIC assignments before your code runs. A text parameter named who arrives as WHO$, a smallint named lives becomes a genuine 16-bit LIVES%, and everything numeric otherwise lands in a 40-bit CBM float, all nine glorious significant digits of it.
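For the curious, the argument injection described above can be pictured with a small sketch. This is a guess at the shape of the idea, not the extension's code: the function name `build_prologue`, the one-argument-per-line layout, and the refusal to handle embedded quotes are all assumptions of mine.

```python
# Hypothetical sketch: the real extension's implementation is not shown in the post.
SUFFIX = {"text": "$", "smallint": "%"}  # anything else becomes a plain float variable


def build_prologue(args):
    """Turn (name, sql_type, value) triples into numbered BASIC assignments."""
    if len(args) > 10:
        raise ValueError("only lines 0 to 9 are reserved for arguments")
    lines = []
    for line_no, (name, sql_type, value) in enumerate(args):
        var = name.upper() + SUFFIX.get(sql_type, "")
        if sql_type == "text":
            if '"' in value:
                # BASIC V2 has no escape for a quote inside a string literal;
                # how the extension copes is not stated, so the sketch refuses.
                raise ValueError("embedded double quotes are not handled here")
            literal = f'"{value}"'
        else:
            literal = str(value)
        lines.append(f"{line_no} {var}={literal}")
    return lines


print("\n".join(build_prologue([("who", "text", "WORLD"), ("lives", "smallint", 3)])))
# 0 WHO$="WORLD"
# 1 LIVES%=3
```

The user's own code would then follow from line 10 onwards, seeing `WHO$` and `LIVES%` as if someone had typed them in first.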
Anyone who programmed a C64 for more than an hour discovered that you could not have a variable called TOTAL. The tokeniser crunched keyword
Yesterday, I had the pleasure of presenting at the Postgres User Group Estonia, and that was a delightful experience! Many thanks to Ervin Weber, who literally spent three years trying to make it happen. I was happy to give back to one of my favorite places in the world – the city of Tallinn.
I was a little bit hesitant when Ervin indicated his preference to listen to my pg_acm talk. I thought that this talk was often viewed as “too specialized”, “niche,” and not interesting enough to people who are “not very much into Postgres.” And I am so glad I ended up giving this talk to this particular group!
I have probably never heard such extensive and thoughtful feedback! Multiple people approached me during the break, saying they had run into all the problems I described, that they understand the challenges, and that they would love to give it a try! (and now I need to make sure all the bugs in the open-source version are fixed! – Watch for updates on this GitHub repo).
That was a slightly extended version of the talk I gave at PG DATA, and now that this talk has been accepted for PG.Conf EU, I need to extend it a little more, and I know what I will add and how I will incorporate the feedback I received yesterday! It always surprises me that application developers “get it” right away, unlike many DBAs, and understand the advantages of that approach. Each question I received yesterday was clear evidence that people had thought about the problems I was trying to solve and were happy to hear that a solution is available.
Thank you, Tallinn! We will do it again
For PostgreSQL administrators, DBAs, SREs, and platform teams, understanding how backup data moves through a system is just as important as knowing when a backup completed successfully. Questions about repository layout, WAL handling, metadata, integrity, and recovery usually surface when troubleshooting, validating a backup strategy, or preparing for recovery.
Rather than focusing on commands or configuration, the Storage and Recovery Guide follows the lifecycle of a backup through pg_hardstorage. Beginning with PostgreSQL, it walks through how base backups and WAL are captured, how data enters the repository, how chunks, manifests, and metadata are organized, and how those components come together to reconstruct a database during recovery.
The guide also explains the engineering decisions that shape the repository itself. It explores topics such as content-addressed storage, chunking and deduplication, manifest design, metadata management, repository layout, integrity verification, corruption handling, crash safety, garbage collection, and the restore workflow, showing how these components work together rather than as isolated features.
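As a rough illustration of how content-addressed storage, chunking, deduplication, manifests, and integrity verification fit together, consider the toy sketch below. It does not reflect pg_hardstorage's actual on-disk format or API: the `ChunkStore` class, the fixed 4-byte chunk size, and the choice of SHA-256 are assumptions picked to keep the example small.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store: each chunk is keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks, store each once, return a manifest."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # identical chunks deduplicate
            manifest.append(digest)
        return manifest

    def get(self, manifest):
        """Rebuild the data from a manifest, verifying every chunk on the way."""
        parts = []
        for digest in manifest:
            chunk = self.chunks[digest]
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise ValueError(f"chunk {digest} is corrupt")
            parts.append(chunk)
        return b"".join(parts)


store = ChunkStore()
manifest = store.put(b"AAAABBBBAAAA")
print(len(manifest), len(store.chunks))  # 3 2
print(store.get(manifest))               # b'AAAABBBBAAAA'
```

The manifest lists three chunks but only two are stored, because the repeated `AAAA` chunk hashes to the same key. The same digest doubles as the integrity check on restore.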
If you would like to explore the implementation alongside the architecture, the GitHub repository contains the project source code, documentation, and ongoing development of pg_hardstorage.
GitHub Repository: https://github.com/cybertec-postgresql/pg_hardstorage
Whether you are reviewing the repository design, evaluating the storage architecture, or simply interested in how pg_hardstorage approaches backup and recovery, the complete Storage and Recovery Guide provides a detailed walkthrough of the concepts, design decisions, and recovery flow behind the project.
The Storage and Recovery Guide is available under the resources section: https://www.cybertec-postgresql.com/en/products/pg-hardstorage
The post Following a Backup from PostgreSQL to Recovery usin
[...]
After more than 20 years working with PostgreSQL, I keep seeing the same problems surface at the worst possible times - bloat that sneaks up on you, replication slots quietly holding back WAL, transaction ID wraparound that nobody caught in time, backups that silently stopped working weeks ago. There are also data and catalog corruption issues like TOAST table corruption or a mismatch between heap state and VM state causing problems with vacuum operations. What I always wanted was a single tool I could point at any PostgreSQL instance and get a clear, actionable picture of its health. So I built one.
pg-healthcheck is an open source utility written in Go that runs 180+ checks across 14 groups, querying live PostgreSQL system catalog views directly… no estimates, no simulated data. It works against single PostgreSQL instances as well as pgEdge multi-node Spock clusters, and gives you either coloured terminal output or structured JSON you can feed into a monitoring pipeline.
Hot off the press: pgcopydb v0.18 is out!
It’s the biggest release the project has had — 88 commits since v0.17, which shipped in August 2024. I took a break from my Open Source responsibilities for a while, because I was lacking employer support to make it happen.
pgcopydb copies a PostgreSQL database to another PostgreSQL server, as fast as possible when physical file copy isn’t available. It parallelises the COPY across all tables simultaneously, builds indexes in parallel after data is loaded, and supports Change Data Capture via logical replication for minimal-downtime migrations. It is designed to be restartable: state is tracked in a local SQLite catalog so an interrupted run can resume where it left off.
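The restartability idea is easy to picture with a small sketch. To be clear, this is not pgcopydb's catalog schema or code: the `copy_state` table and the helper functions are hypothetical, and only show how persisting progress in SQLite lets a second run skip what the first one finished.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the local state catalog. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS copy_state ("
        " table_name TEXT PRIMARY KEY,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def register_tables(db, tables):
    """Record the tables to copy; tables already known keep their state."""
    db.executemany(
        "INSERT OR IGNORE INTO copy_state (table_name) VALUES (?)",
        [(t,) for t in tables],
    )
    db.commit()


def pending(db):
    """Tables that still need copying, i.e. where a resumed run picks up."""
    rows = db.execute(
        "SELECT table_name FROM copy_state WHERE done = 0 ORDER BY table_name"
    )
    return [name for (name,) in rows]


def mark_done(db, table):
    """Persist completion right away so a crash never loses finished work."""
    db.execute("UPDATE copy_state SET done = 1 WHERE table_name = ?", (table,))
    db.commit()


db = open_catalog()
register_tables(db, ["orders", "customers", "items"])
mark_done(db, "customers")  # ...imagine the run is interrupted after this...
register_tables(db, ["orders", "customers", "items"])  # restart with the same list
print(pending(db))  # ['items', 'orders']
```

In a real run the catalog would live in a file rather than in memory, so that the state survives the process that wrote it.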
v0.18 brings compatibility with PostgreSQL 16, 17, and 18; a pgoutput-default CDC engine with significant reliability and performance improvements; regular-expression-based filtering; Citus-to-Citus migration support; and 24 bug fixes.
The source code is available on Codeberg.
The extension is also available on PGXN.
The extension is also available through the PostgreSQL RPM packages.
This minor update fixes a problem in the uninstall script.
Version 1.0 began installing the extension's objects in a schema of their own, and the generated uninstall script stopped working as a result.
Uninstalling now also works when the extension is installed in a schema with a different name.
PostgreSQL community images address a real gap in how a Kubernetes database operator earns your trust. Running a database operator on Kubernetes means trusting two things: the code, and the container images the operator pulls. The code is on GitHub, easy to inspect, easy to fork. The container images, the registry that hosts them, and the license that governs them all sit with the vendor, and any of those three can change without the source repository changing at all. Starting with Percona Operator for PostgreSQL 3.0.0, you can run the operator against community images you build yourself from the official PostgreSQL packages on download.postgresql.org, in a registry you control.
Open source has changed in the last few years, and not always for the better. Companies have learned that you can keep a project’s source code fully open and still capture most of the lock-in by quietly closing the parts that matter in production: the release artifacts, the container images, the supported OS list, the certified Kubernetes distributions, the marketplace listings.
You can have a fully community CNCF proj
[...]

© Laurenz Albe 2026 (see here for more background)
Recently, I helped a customer investigate database problems. It turned out that these problems could be traced back to too many tables in the database. Since this may come as a surprise to many users, I thought it worthwhile to write about it.
There were two problems that sounded like they might or might not be related to each other:
The first step in investigating OOM problems is always to disable memory overcommit. I originally suspected other software running on the machine to cause the memory shortage, but memory context dumps showed that PostgreSQL was at fault.
There are several typical causes for high memory usage in PostgreSQL:
work_mem
However, it turned out that in this case, it was something else that hogged the memory.
After disabling memory overcommit, we got memory context dumps in the log file, as well as log entries from autovacuum workers that failed to fork because there was too little memory. Initially, the memory context dumps did not show anything interesting: they were from victims of memory starvation rather than from the culprits. That is also why I originally suspected causes external to PostgreSQL. But then we got some memory context dumps that looked as follows:
TopMemoryContext: 355016[...]
The most recent meeting of ISO/IEC JTC1 SC32 WG3 “Database Languages” took place from the 15th to the 19th of June 2026 in Stockholm. “WG3”, as we call it, works on standardizing the database languages SQL and GQL. In that meeting, a number of proposals that are of interest to SQL and PostgreSQL were accepted, which I want to report about here.
The meeting code of this meeting was “BMA”, which is the code for a small airport in Stockholm. All in-person WG3 meetings are named after a nearby airport. In this case, the code “ARN” for Stockholm’s main airport had already been used for a meeting in 2003. (It is whimsically intentional that this system prevents the group from meeting in the same place too many times.)
Now let’s look at the new SQL features that were discussed. (Regards to all the GQL practitioners, but I’m not qualified enough to report on that.)
(Note that when I write below that something has been accepted into the standard, that means it’s now in a working draft. There is still a bit of a road until all of this is an approved actual standard.)
The QUALIFY clause has been accepted into the standard. This clause is a filter like WHERE or HAVING but is applied after window functions. For example:
SELECT product_name, category, price
FROM products
QUALIFY price > avg(price) OVER (PARTITION BY category);
Previously, you would have to do this filtering by wrapping the query in a subquery, like this:
SELECT product_name, category, price
FROM (
    SELECT product_name, category, price,
           avg(price) OVER (PARTITION BY category)
               AS category_avg
    FROM products
) AS q
WHERE q.price > q.category_avg;
The QUALIFY clause is already a nonstandard extension in many SQL implementations. A patch for PostgreSQL was in progress for PostgreSQL 19, but it was paused because some definitional issues had to be resolved, which has now been done with the change proposal to the standard. So I think we can look forward to seeing progress on that for PostgreSQL 20.
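The evaluation order that QUALIFY relies on (window function first, filter second) can be mimicked in a few lines of Python. This is only an emulation of the semantics on made-up sample rows, not a statement about how any database implements the clause.

```python
from collections import defaultdict

products = [
    # (product_name, category, price); made-up sample rows
    ("kettle", "kitchen", 30.0),
    ("blender", "kitchen", 90.0),
    ("toaster", "kitchen", 60.0),
    ("lamp", "living", 40.0),
    ("sofa", "living", 400.0),
]

# Step 1: the window function. Every row sees the average of its partition.
prices_by_category = defaultdict(list)
for _, category, price in products:
    prices_by_category[category].append(price)
category_avg = {c: sum(p) / len(p) for c, p in prices_by_category.items()}

# Step 2: the QUALIFY filter, applied only after the window values exist.
qualified = [row for row in products if row[2] > category_avg[row[1]]]
print(qualified)
# [('blender', 'kitchen', 90.0), ('sofa', 'living', 400.0)]
```

A WHERE clause could not express step 2 directly, because WHERE runs before the per-category averages have been computed.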
Upgrading PostgreSQL 19 clusters has become more seamless with tools like pg_upgrade and pg_createsubscriber, which together enable near-zero-downtime upgrades by first converting physical replicas into logical subscribers and then performing the upgrade with minimal service interruption.
However, this approach exposes a long-standing gap in logical replication: sequence state is not replicated.
Having written about the Swiss PGDay in 2024, I need not repeat all I said back then. Nonetheless, I'd like to share my impressions from the Swiss PGDay 2026 with you.
The Swiss PGDay traditionally takes place on the campus of the University of Applied Sciences at Rapperswil. It is perhaps the most inappropriately named PostgreSQL event I know. Only a strong feeling for tradition can maintain the name “PGDay” for a two-day event with two tracks of talks!
Another tradition is the speakers' and organizers' dinner on the evening before the conference. Since I was lucky enough to have my talk selected, I knew that I could look forward to a pleasant evening with good food and friends. Friends? Yes, it is a business trip for me, and yes, some of the people there work for competing companies. But then, we can leave the competition to the sales departments, and let's be honest: PostgreSQL keeps growing in a healthy fashion, and there should be enough cake for everybody to get happy.
This time, the dinner took place in the fancy restaurant on the city hall building, prominently situated in the main town square. The food was truly excellent, and the friendly waiters made sure that the wine glasses didn't run dry. The high temperature that prevailed during the entire event couldn't quell a happy reunion with Dirk, Daniel and Tobias, who happened to share my table.
This year, my co-worker Svitlana was also present at the dinner — she has helped organize the event. Her example may serve to highlight what PostgreSQL conferences do to you: her professional duties don't revolve around core PostgreSQL, as she is the leading developer of our commercial CYPEX database application generator. Still, she attended last year's Swiss PGDay, and — what shall I say? — it looks like she got hooked.
If all this sounds to you like PostgreSQL is some kind of weird cult that sucks in innocent people and takes over their p
[...]
In the previous post, I tried to lay out the framing half of this material: what actually counts as a disaster, why preparation and prevention aren’t the same as recovery, and how RPO and RTO end up being conversations with leadership rather than numbers an infrastructure team gets to declare on its own.
That part is largely about understanding the problem. This part is about actually building the capability to deal with it. And in my experience, this is where most teams quietly stumble – not because they don’t have backups or replication, but because they’ve never really practiced using either of them under stress.
Once you’ve accepted that DR is a process, the natural next step is to write some runbooks. This is another place where a lot of teams go wrong.
Runbooks usually get written by experts, for experts, in a calm conference room. That’s almost the opposite of the environment they’ll actually be run in. Real recovery happens at 3AM, under stress, often with incomplete information and sometimes with someone who isn’t deeply familiar with the system. A runbook’s job, really, is to reduce ambiguity in that moment – not to be a comprehensive description of the system, but to be the thing you can follow even when you’re tired and scared.
That framing alone changes what a good runbook looks like.
A few patterns I see often that make runbooks worse, not better:
On June 23 2026, the London PostgreSQL Meetup Group met. Organized by:
Speakers:
On June 23, 2026, the Meetup PostgreSQL Lille met. Organized by Stefan Fercot and Yoann La Cancellera. Both delivered a talk at this meetup.
Postgres User Group Frankfurt am Main met on 24 June 2026 where Marc Linster delivered a talk. The Meetup was organized by
On June 25 2026 the PostgreSQL User Group NL met for their Summer Edition. Organized by Gerard Zuidweg and Feike Steenbergen.
Speakers:
The Program Committee of PGDay UK 2026 met to finalize the schedule:
Swiss PGDay 2026 took place on June 25-26, 2026. Organized by:
Talk selection team:
Speakers:
Lightning Talk Speakers:
Everyone knows not to store money as a double precision. One can hope. The rule is so well drilled that it has stopped being interesting, and it is also not where the trouble usually starts. The float is already in the schema before anyone weighs in on it: a measurement column someone later sums for a report, telemetry that drifts into a finance dashboard, a third-party feed ingested as double precision because that is how it arrived.
Here is the part the rule does not warn you about. Take a table of five million floating-point readings, sum the column, and run it three times in a row. Nothing else touches the table. Same connection, same data, same statement.
SELECT sum(reading) FROM measurements;
sum
--------------------
2500519211.7874823
sum
-------------------
2500519211.787477
sum
--------------------
2500519211.7874575
Three runs, three different totals. No UPDATE, no concurrent writer, no random seed. The rows did not change between runs, and the query is the same character for character. Yet the answer is not.
This is not a Postgres bug, and it is not specific to Postgres. It is what happens when floating-point arithmetic meets parallel aggregation, and it has been generating "my dashboard total changed and I have no idea why" tickets for as long as databases have parallelized. The non-determinism does not wait for you to opt into bad practice; it shows up the moment a parallel plan runs over whatever floats you happen to have.
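The underlying effect fits in a few lines of Python, using three values instead of five million. The split into two "workers" below is only an illustration of regrouping the additions, not a model of how Postgres hands out rows, and `math.fsum` is a Python facility shown for contrast, not something the database provides.

```python
import math

readings = [0.1, 0.2, 0.3]

# A serial plan adds the values left to right.
serial = (readings[0] + readings[1]) + readings[2]

# A parallel plan hands rows to workers and then combines partial sums.
# Here one "worker" gets the first row and another gets the last two.
worker_a = readings[0]
worker_b = readings[1] + readings[2]
parallel = worker_a + worker_b

print(serial)              # 0.6000000000000001
print(parallel)            # 0.6
print(serial == parallel)  # False

# math.fsum tracks the rounding error, so the grouping no longer matters.
print(math.fsum(readings) == math.fsum(reversed(readings)))  # True
```

Same three numbers, two groupings, two answers. Floating-point addition is not associative, so any change in which rows land in which partial sum can move the last few digits of the total.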
One column of double precision, five million rows.
CREATE TABLE measurements (
id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
reading double precision NOT NULL
);
INSERT INTO measurements (reading)
SELECT random() * 1000
FROM generate_series(1, 5000000);
ANALYZE measurements;
Five million rows is enough that Postgres parallelizes the sum on its own. No settings forced, defaults all the way:
EXPLAIN (COSTS OFF) SELECT sum(reading) FROM measurements;
QUERY PLAN
--------------[...]
Number of posts in the past two months
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.