
© Laurenz Albe 2025
While I sometimes write about real-life problems I encounter in consulting sessions (see this example), I admit that curiosity and the desire to play are the motivation for many of my articles (like this one). So I am glad that a friend of mine, whose job as a DBA puts him on the front line every day, offered to let me quote from his (unreleased) memoirs. Finally, this is material that I hope will connect with all the hard-working souls out there! My friend insists on anonymity, hence the “bastard DBA from hell” in the title.
I like being a DBA. The only problem is the people using the databases.
A week ago I get a call from accounting. The guy sounds like I stole his sandwich, “We get deadlocks all the time.”
“Well, don't do it then,” I tell him. I should have known that this kind of person wouldn't listen to good advice.
“It's your Postgre system that gives me these deadlock errors. So you fix them”.
“As you wish.” >clickety click< “Ok, you shouldn't get any more deadlock errors now.”
He mumbles something and hangs up. It didn't sound like “thank you”. That's fine with me. After all, I set his lock_timeout to 500.
Yesterday the guy from accounting called me again, in that accusing, complaining tone, “Our disk is 95% full.”
“I know, I got an alert from the monitoring system. Actually, I sent you guys an e-mail about it. 10 days ago.”
“Can't you fix that?”
“Do you want me to extend the file system?”
“We have no more data in there than before. So why do we need a bigger disk?”
I tell him “hang on” and log into their system. >clickety click<
“Well, you got a 350GB table there that's taking up all the space. But it's 99% dead data... Did you create that prepared transaction a month ago and never commit it?” >clickety click< “There, gone now. Just run a VACUUM (FULL) on the table and you'll be good.”
I thought I was r
[...]PostgreSQL 18 brings a couple of performance-related features to the table which will help applications run more efficiently, providing a better and more enjoyable user experience. One of those performance features is called “skip scans”. Most of you might ask yourself at this point: Wow, sounds cool, but what is a skip scan? The purpose of this post is to shed some light and explain how this works, what it does and, most importantly, how one can benefit from this feature in real life.
To understand what a skip scan does, we first need to create some test data that is suitable for the challenge. Here is a classical example which can help to illustrate the topic nicely:
test=> CREATE UNLOGGED TABLE tab (
frequent integer,
rare integer
);
CREATE TABLE
test=> INSERT INTO tab
SELECT i / 100000, i % 100000
FROM generate_series(0, 999999) AS i;
INSERT 0 1000000
test=> VACUUM (ANALYZE) tab;
VACUUM
test=> CREATE INDEX ON tab (frequent, rare);
CREATE INDEX
We have created a table with just two columns. The first column contains just a handful of different values (so most of them are identical, making the index struggle). The second column contains many more different values. The challenge now is: How can we make use of the combined index we created efficiently?
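As a hedged illustration of that challenge (my query, not one from the post; the exact plan depends on the PostgreSQL version and the data), consider filtering on the second column only:
test=> EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tab WHERE rare = 1234;
Before PostgreSQL 18, such a predicate cannot use the (frequent, rare) index efficiently, because the leading column is missing from the WHERE clause; a skip scan lets the index be searched once per distinct value of frequent.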
In general, a btree in PostgreSQL can be seen as a sophisticated version of a sorted list. This is also true if we are dealing with a composite index. In our case, a subset of the index might look like this:
A composite index is sorted by “frequent” and “rare” (just like a phone book). Of course, there are various sophistications such as deduplication and a lot more, but the basic principle remains the same: We are dealing with a list sorted by those columns (or expressions) in the index. If you want to learn more about indexing, consider the following posts:
[...]Making GROUP BY a bit easier to use is, in my experience, among the top three requested features in SQL.
Like, if you do
CREATE TABLE t1 (a int, b int, ...);
SELECT a, avg(b) FROM t1 GROUP BY a;
the column list in the GROUP BY clause doesn’t convey much information. Of course you wanted to group by a; there is no other reasonable alternative. You can’t not group by a because that would be an error, and you can’t group by things besides a, because there is nothing else in the select list other than the aggregate.
The problem gets worse if you have a longer select list or complicated expressions, because you need to repeat these in the GROUP BY clause and carefully keep them updated if you change something. (Or you can try to work around this by using subqueries.) Could be easier!
A number of implementations have adopted the syntax GROUP BY ALL to simplify this.
SELECT a, avg(b) FROM t1 GROUP BY ALL;
The SQL standard working group discussed this feature informally at the June 2025 meeting, and there was consensus about going forward with it.
At the September 2025 meeting, a formal change proposal was brought forward and accepted. (So, technically, it’s just a working draft right now, and it won’t be final until the standard is released.)
The formal meaning of GROUP BY ALL is that it expands to a list of the elements of the select list that do not contain aggregate functions. So in
SELECT a, avg(b) FROM t1 GROUP BY ALL;
a does not contain an aggregate function, but avg(b) does, so GROUP BY ALL resolves to GROUP BY a, as expected.
This doesn’t completely remove the need for explicit GROUP BY lists. Consider a more complicated case like
CREATE TABLE t1 (a int, b int, c int, d int, ...);
SELECT a, avg(b) + c + d FROM t1 GROUP BY ALL;
Here, a does not contain an aggregate function, but avg(b) + c + d does contain an aggregate function, so the query resolves to
SELECT a, avg(b) + c + d FROM t1 GROUP BY a;
But that is not valid, because you need t
[...]
Your API is slowing down. You check your database and find 42 indexes on your users table. Which ones can you safely drop? How much performance are they costing you? Let's look at what actually happens in Postgres when you have too many indexes.
If you're a backend or full-stack engineer, you probably don't want to become an indexing expert — you just want your API fast and stable, without babysitting pg_stat_user_indexes.
Index maintenance includes multiple activities: dropping unused indexes, dropping redundant indexes, and rebuilding indexes on a regular basis to get rid of index bloat (and of course, keeping autovacuum well tuned).
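For the first of those activities, here is a minimal sketch (my query, assuming the standard pg_stat_user_indexes view; check replicas and how long statistics have been collected before actually dropping anything):
-- Indexes that have never been used by scans on this node: candidates for review.
-- Note: unique/constraint-backing indexes must not be dropped just because idx_scan = 0.
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;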
There are many reasons why we need to keep our index set lean, and some of them are tricky.
I have been collecting these ideas for years. Here's the current list (more to come):
As for index bloat, reasons 3-6 apply to bloated indexes as well. Plus, if an index is extremely bloated (e.g., 90%+ bloat, or more than 10x the optimal size), index scan latencies suffer. The Postgres B-tree implementation lacks merge operations — once a page splits, those pages never merge back together, even after deletions. Over time, this leads to increasing fragmentation and bloat. Deduplication in PG13+ helps compress duplicate keys, and bottom-up deletion in PG14+ reduces bloat by removing dead tuples more aggressively during insertions. However, these features don't address structural degradation from page splits. Regular monitoring and rebuilding of bloated indexes remains essential maintenance work.
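To put numbers on bloat and fragmentation for a single index, one option is the pgstattuple extension (a sketch; users_email_idx is a hypothetical index name):
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Low avg_leaf_density and high leaf_fragmentation suggest the index is a rebuild candidate.
SELECT index_size, avg_leaf_density, leaf_fragmentation
FROM pgstatindex('users_email_idx');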
Let's examine each item from the list, studying Postgres source code (here, it's Postgre
[...]This quarter's round of minor releases is expected later this week. Note that this will be the last minor release for PostgreSQL 13.
PostgreSQL 19 changes this week
logging: wal_fpi_bytes value shown in log entries for VACUUM and ANALYZE operations
logical replication support for sequence synchronization added
pgbench option --continue-on-error added
pg_stat_subscription_stats column seq_sync_error_count added
psql \pset variables display_true and display_false added
SET NULL now accepted for specifying an empty list
WAIT FOR SQL command added

PostgreSQL 18 articles
PostgreSQL 18 enables data‑checksums by default (2025-11-03) - Josef Machytka / credativ
What do the new Index Searches lines in EXPLAIN mean? (2025-10-10) - Michael Christofides / pgMustard

New podcast episode published by Claire Giordano from her series “Talking Postgres”: Building a dev experience for Postgres in VS Code with Rob Emanuele
Blog posts
Ellyne Phneah Key takeaways from the PostgreSQL & AI Summit: RAG, agents, and tuning
Claire Giordano Postgres Trip Summary from PGConf EU 2025
Postgres has an awesome amount of data collected in its own internal tables. Postgres hackers know all about this - but software developers and folks working on day-to-day Postgres tasks often miss out on the good stuff.
The Postgres catalog is how Postgres keeps track of itself. Of course, Postgres would do this in a relational database with its own schema. Throughout the years, several nice features, like psql tools and views, have been added that make navigating Postgres’ internal tables even easier.
Today I want to walk through some of the most important Postgres internal data catalog details. What they are, what is in them, and how they might help you understand more about what is happening inside your database.
The easiest way to get at some of Postgres’ internal catalogs is to use the built-in psql commands, which generally begin with \d. Here are some common ones users should be comfortable using:
\d {tablename}: describes a specific table. \d will do a lot of things if you qualify \d with a table or view name.
\di: list all your indexes
\dx: list installed extensions
\dp: to show access privileges
\dp+: tables and views with the roles and access details
\dconfig: your current configuration settings
\dt {tablename}: list tables (optionally matching a pattern)
\dti+: tables and indexes with sizes
\dg+: show role names
\df: show your functions
\dv {view name}: describe a view
\l: lists all your databases
Postgres exposes many of the complex internals of the database system in easy-to-query views. These host a wealth of information about what is going on inside your database and give direct SQL access to answer in-the-moment emergency questions like “what is taking up all my CPU” and longer-term questions like “what are my 10 slowest queries”.
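For example, the in-the-moment question can be answered directly from one of these views (a sketch using pg_stat_activity, which is described next):
-- What is running right now, longest-running statements first.
SELECT pid, state, now() - query_start AS runtime, left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC NULLS LAST;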
pg_stat_activity: Shows current database activity, including running queries, state, and client information. Essential for troubleshoot
[...]
I'm happy to announce release 0.8 of pg_statviz, the minimalist extension and utility pair for time series analysis and visualization of PostgreSQL internal statistics.
This release adds support for PostgreSQL 18, adapting to significant catalog view changes introduced in this release:
pg_stat_io view now reports I/O activity in bytes (read_bytes, write_bytes, extend_bytes) rather than just operation counts, providing more granular insight into system I/O patterns.
WAL I/O statistics have moved from pg_stat_wal to pg_stat_io (where object = 'wal'), consolidating I/O metrics across different subsystems.
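As a quick illustration of where those metrics now live (a sketch against PostgreSQL 18, using the columns mentioned above):
SELECT backend_type, context, reads, read_bytes, writes, write_bytes
FROM pg_stat_io
WHERE object = 'wal';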
The extension automatically detects your PostgreSQL version and uses the appropriate catalog views, maintaining backward compatibility with PostgreSQL 14 through 17 while taking advantage of the enhanced metrics available in PostgreSQL 18. The Python utility has also been updated to handle both the new byte-based I/O metrics and the operation counts from earlier versions.
This release also bumps the minimum Python requirement to 3.11+ and updates the numpy dependency to 2.3.4.
pg_statviz takes the view that everything should be light and minimal. Unlike commercial monitoring platforms, it doesn't require invasive agents or open connections to the database — it all lives inside your database. The extension is plain SQL and pl/pgSQL and doesn't require modules to be loaded, the visualization utility is separate and can be run from anywhere, and your data is free and easy to export.
You can install pg_statviz from the PostgreSQL repositories or PGXN.
PostgreSQL 13 will officially reach End-of-Life (EOL) on November 13, 2025. After this date, the PostgreSQL Global Development Group will stop releasing security patches and bug fixes for this version. That means if you’re still running PostgreSQL 13, you’ll soon be on your own with no updates, no community support, and growing security risks. Why […]
In the third installment of the PostGIS Performance series, I wanted to talk about performance around bounding boxes.
Geometry data is different from most column types you find in a relational database. The objects in a geometry column can be wildly different in the amount of the data domain they cover and the amount of physical space they take up on disk.
The data in the “admin0” Natural Earth data range from the 1.2 hectare Vatican City, to the 1.6 billion hectare Russia, and from the 4 point polygon defining Serranilla Bank to the 68 thousand points of polygons defining Canada.
SELECT ST_NPoints(geom) AS npoints, name
FROM admin0
ORDER BY 1 DESC LIMIT 5;
SELECT ST_Area(geom::geography) AS area, name
FROM admin0
ORDER BY 1 DESC LIMIT 5;
As you can imagine, polygons this different will have different performance characteristics:
Your spatial indexes are “r-tree” indexes, where each object is represented by a bounding box.
The bounding boxes can overlap, and it is possible for some boxes to cover a lot of the dataset.
For example, here is the bounding box of France.
What?! How is that France? Well, France is more than just the European parts: it also includes the island of Réunion, in the southern Indian Ocean, and the island of Guadeloupe, in the Caribbean. Taken together they result in this very large bounding box.
Such a large box makes a poor addition to the spatial index of all the objects in “admin0”. I could be searching with a query key in the middle of the Atlantic, and the index would still be telling me “maybe it is in France?”.
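To see exactly what box the index is working with, you can query it directly (a sketch, assuming the admin0 table from the post and PostGIS's Box2D function):
SELECT name, Box2D(geom) AS bbox
FROM admin0
WHERE name = 'France';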
For this testing, I have made a synthetic dataset of one million random points covering the whole world.
CREATE TABLE random_normal AS
SELECT id,
ST_Point(
random_normal(0, 180),
r[...]
Due to an unfortunate recent visitation by the Virus of the Decade (so far), I have a backlog of these which I'm trying to work through, so in the remote chance anyone is waiting with bated breath for the newest editions, my apologies. Normal service will be resumed as soon as humanly possible.
PostgreSQL 19 changes this week
Add log_autoanalyze_min_duration

PostgreSQL 19 articles
Waiting for PostgreSQL 19 – Add IGNORE NULLS/RESPECT NULLS option to Window functions. (2025-10-13) - Hubert 'depesz' Lubaczewski

PostgreSQL 18 articles
Benchmarking Postgres 17 vs 18 (2025-10-14) - Ben Dickens / PlanetScale

Almost a month late, but I hope you enjoy it!
In one of our earlier blogs, we explored how improper volatility settings in PL/pgSQL functions — namely using IMMUTABLE, STABLE, or VOLATILE — can lead to unexpected behavior and performance issues during migrations.
Today, let’s revisit that topic from a slightly different lens. This time, we’re not talking about your user-defined functions, but the ones PostgreSQL itself provides — and how their volatility can quietly shape your query performance, especially when you’re using partitioned tables.
When Partitioning Doesn’t Perform
When designing partitioning strategies, performance is usually the first thing that comes to mind.
But what if, despite all your tuning, the queries still fail to leverage partition pruning effectively?
The answer sometimes lies not in your schema or indexes, but in how your filters are evaluated.
Every function you use in a filter has its own volatility behavior that determines how it’s planned and executed. You might pick the right filter logically — but if the function is volatile, it can completely change how PostgreSQL builds and optimizes the plan.
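As a hedged sketch of the effect (using the events_daily table defined below; the exact plans are an assumption): a STABLE expression such as current_date still allows partition pruning, while a VOLATILE one such as clock_timestamp() does not.
-- Prunes: current_date is STABLE, so the planner/executor can restrict the scan to one partition.
EXPLAIN SELECT count(*) FROM events_daily WHERE created_date = current_date;
-- Does not prune: clock_timestamp() is VOLATILE, so every partition must be scanned.
EXPLAIN SELECT count(*) FROM events_daily WHERE created_date = clock_timestamp()::date;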
Let’s say you’ve designed a table with daily partitions, use date functions as partition filters, and check the underlying execution plans.
CREATE TABLE IF NOT EXISTS events_daily (
id bigint generated always as identity ,
created_at timestamptz NOT NULL,
created_date date NOT NULL,
user_id int NOT NULL,
amount numeric(10,2) NOT NULL,
status text,
payload text
) PARTITION BY RANGE (created_date);
--create a year of partitions as daily
DO
$$
DECLARE
start_date date := current_date - INTERVAL '364 days';
d date;
part_name text;
BEGIN
d := start_date;
WHILE d <= current_date LOOP
part_name := format('events_p_%s', to_char(d, 'YYYYMMDD'));
EXECUTE format($sql$
CREATE TABLE IF NOT EXISTS %I PARTITION OF events_daily
FOR VALUES FROM (%L) TO (%L);
$sql$, part_name, d::text, (d + INTERVAL '1 day')::da
[...]
As a database consulting company, we are often faced with analytics and reporting related tasks which seem to be easy on the surface but are in reality not that trivial. The list of those seemingly simple things is longer than one might think, especially in the area of reporting.
The purpose of this post is not to share some fancy SQL techniques, some super cool magic AI tool that does whatever new hot thing or some crazy coding technique one can use in PostgreSQL. This is all about two simple things that are often forgotten or just neglected. We are of course talking about “awareness” and “clarity”.
Yes, it is that simple. If you don’t know what you have or if you don’t know what you want, all data is worthless (in a best case scenario) or even dangerous.
One of those seemingly simple things might be related to a trivial question: “How many customers do you have?”. Every CEO will proudly talk about this number, but have you ever asked those guys a simple question: “What is a customer?”.
Well, just try. In my experience the process often starts with an arrogant smile that indicates “What stupid question is that?”. But just dig deeper and keep asking for details. I can assure you: In the majority of cases, people have no clue what they are talking about.
Here are some typical scenarios:
One could continue this list forever and a day. In other words: it is not that simple, and many “business people” tend to think that displaying things in a pie chart has the power to change the fundamental problem: “What do you actually count?”
The following data structure shows a simple scenario: A bank wants t
[...]
I pushed the most boring change imaginable: add an index. Our CI/CD pipeline is textbook ==> spin up a fresh DB, run every migration file sequentially, in one single transaction. If anything hiccups, the whole thing rolls back and the change never hits main. Foolproof autotests.
Enter The Drama Queen:
ERROR: cannot CREATE INDEX "index_name_xyz" on table abcdef_tab
because it is being used by active queries in this session
This session. Same PID, same transaction. No parallel runner. No second connection. Still blocked.
Our schema migration repository structure is as shown above. The CI tool will create a fresh DB → execute all scripts chronologically → one by one → one transaction → one session, as part of an autotest before merging any new code (DML/DDL for schema migration = infrastructure as code) to the main branch.
Naturally ignoring the last part of the error message like a seasoned DBA, I accused the CI tool of going rogue: maybe someone “optimized” migrations to run in parallel after reading an LLM pep talk about “unleashing concurrency.”
I printed the PID at the start and end of every file.
-- Stamp the migration transaction
SELECT 'start', pg_backend_pid(), txid_current();
-- ... run all migration steps ...
SELECT 'end', pg_backend_pid(), txid_current();
Same PID. Same transaction. Same sadness.
One of the least talked about but best features of Postgres as a FOSS project is that you can search for an error message in the code base, and the documentation is excellent.
As I explained in my talk at PostgreSQL Conference Europe 2025, data corruption can be silently present in any PostgreSQL database and will remain undetected until we physically read the corrupted data. There can be many reasons why some data blocks in tables or other objects can be damaged. Even modern storage hardware is far from being infallible. Binary backups done with the pg_basebackup tool – which is a very common backup strategy in PostgreSQL environments – leave these problems hidden, because they do not check the data but copy whole data files as they are. With the release of PostgreSQL 18, the community decided to turn on data‑checksums by default – a major step toward early detection of these failures. This post examines how PostgreSQL implements checksums, how it handles checksum failures, and how we can enable them on existing clusters.
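Before getting into the mechanics, here is a quick sketch of how to check the current state on a running cluster (the pg_stat_database columns exist since PostgreSQL 12):
SHOW data_checksums;
-- Databases where checksum verification has already caught something.
SELECT datname, checksum_failures, checksum_last_failure
FROM pg_stat_database
WHERE checksum_failures > 0;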
A PostgreSQL table or index is stored in 8 KB pages. When a page is written to disk, PostgreSQL computes a 16‑bit checksum using every byte of the page (except the checksum field itself) and the page’s physical block address. The checksum is stored in the page header. On every read, PostgreSQL recalculates the checksum and compares it against the stored value. Because the block address is part of the calculation, the system detects both bit flips within the page and pages written to the wrong place. Checksums are not maintained while the page sits in shared buffers – they are computed only when the page is flushed from the buffer cache to the operating system page cache. Consequently, an incorrect in‑memory page cannot be detected until it is written and read again. PostgreSQL uses a fast FNV‑1a hash (with CRC32C on WAL records) that is optimized for performance. On typical hardware the cost of calculating checksums seems to be small; benchmarking studies found the penalty is usually less than 2% for normal workloads. PostgreSQL 18’s release notes acknowledge that the overhead is non‑zero but accept it for the benefit of data integrit
One of the most read articles at boringSQL is Time to Better Know The Time in PostgreSQL where we dived into the complexities of storing and handling time operations in PostgreSQL. While the article introduced the range data types, there's so much more to them. And not only for handling time ranges. In this article we will cover why to consider range types and how to work with them.
But before we can talk about the range types, let's try to understand why we should look at them in the first place. Let's imagine a booking platform for large flash sales of seats, one that goes live at 10pm and will be taken by storm by thousands of people who want to get their tickets.
CREATE TABLE seat_holds (
hold_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
seat_id INTEGER NOT NULL REFERENCES seats(id),
user_session_id UUID NOT NULL,
-- define the hold period explicitly
hold_started_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
hold_expires_at TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX seat_holds_on_seat_id ON seat_holds(seat_id);
CREATE INDEX seat_holds_on_expiration ON seat_holds(hold_expires_at);
While the table design looks perfectly reasonable, it has one serious flaw: there is no database-level atomicity guarantee to prevent two holds for the same seat_id at the same time. The table design relies on application logic to check for existing holds before inserting a new one, and at the same time it does not provide any high-concurrency guarantee.
If all you have in your toolbelt are those two timestamp columns, you will end up increasing the complexity to make it work. You started with one problem, and soon your application developers might ask you to add a caching layer (most likely an external K/V store) to place the holds there, and very soon you have N problems when you resort to building a complex, custom application
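The excerpt breaks off here, but as a hedged sketch of where range types lead (my code, not necessarily the post's solution): a single tstzrange column plus an exclusion constraint lets the database itself reject overlapping holds for the same seat.
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- needed to combine seat_id equality with range overlap
CREATE TABLE seat_holds_ranged (
    hold_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    seat_id INTEGER NOT NULL REFERENCES seats(id),
    user_session_id UUID NOT NULL,
    hold_period TSTZRANGE NOT NULL,
    EXCLUDE USING gist (seat_id WITH =, hold_period WITH &&)
);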
[...]PostgreSQL received attention through the following contributions at Data Stack Conf 2025 on Oct 29, 2025:
Speaker
PostgreSQL Booth Staff
Gabriele Quaresima spoke at Cloud Native Bergen on Tuesday, October 28, 2025.
Community Blog Posts
Mayuresh B: Slonik on the Catwalk: PGConf.EU 2025 Recap
Floor Drees: My PGConf EU 2025 experience
Cornelia Biacsics: The PostgreSQL Village and Head of Marketing experiences @PGConf.EU 2025 – A Riga Recap
Jan Karremans: The Pulse of PostgreSQL: A Global Perspective from PGConf.EU 2025
Rafia Sabih: A Glimpse of PostgreSQL Conference Europe 2025
Valeria Kaplan: PostgreSQL — Blurring the Line Between Mine and Ours
I wrote a post in my MySQL blog on migrating from MySQL to PostgreSQL using DBeaver. You can pass it along to your acquaintances who want to get off the Dolphin and on the Elephant.
Not only will DBeaver move your tables and data, but you can compare them afterwards. In the post, I outline the process in five steps. DBeaver will let you do it in four.
Please share this blog with your friends to promote PostgreSQL.
High availability in PostgreSQL doesn't always require a globally distributed architecture. Sometimes you need reliable failover and replication within a single datacentre or region. pgEdge Enterprise Postgres handles this scenario with a production-ready PostgreSQL distribution that includes the tools you need for high availability out of the box.
From time to time I get to investigate issues with some sort of a batch process. It’s getting more and more common that such processes use very high memory limits (maintenance_work_mem and work_mem). I suppose some DBAs follow the logic that “more is better”, not realizing it can hurt performance quite a bit.
Let me demonstrate this using an example I ran across while testing a fix for parallel builds of GIN indexes. The bug is not particularly interesting or complex, but it required a fairly high value for maintenance_work_mem (the initial report used 20GB).
In #PostgresMarathon 2-009 and #PostgresMarathon 2-010, we explored why execution 6 causes a lock explosion when building a generic plan for partitioned tables — the planner must lock all 52 relations because it can't prune without parameter values.
Today we'll test what actually happens with different plan_cache_mode settings.
Let's test empirically on Postgres 18 with the 12-partition table from the previous posts. This time, we'll insert 1M rows into each partition to get realistic query plans with index scans:
-- Insert 1M rows into each partition
do $$
declare
i int;
start_date date;
partition_name text;
begin
for i in 0..11 loop
start_date := '2024-01-01'::date + make_interval(months => i);
partition_name := 'events_' || to_char(start_date, 'YYYY_MM');
execute format(
'insert into %I (event_id, event_time, event_data)
select s, %L::timestamptz + (s * interval ''1 second''), ''data_'' || s
from generate_series(1, 1000000) s',
partition_name, start_date
);
end loop;
end $$;
vacuum analyze events;
Now let's see what different plan_cache_mode settings do.
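For reference, these are the three values we will be switching between (a session-level sketch):
set plan_cache_mode = 'auto';               -- default: cost-based choice between plan types
set plan_cache_mode = 'force_custom_plan';  -- always replan with the actual parameter values
set plan_cache_mode = 'force_generic_plan'; -- always use the parameter-independent plan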
Test 1: auto mode
As we already saw, with the default auto mode setting, Postgres decides whether to use custom or generic plans based on a cost comparison. We saw that for the first 5 runs it uses a custom plan; then, on the important 6th call, it builds the generic plan (and we get the "lock explosion" we already studied); and from the 7th call onward, it uses the generic plan.
But that was on empty tables. Here things will be different, because we have data. Let's see.
As before, prepare a statement and execute it multiple times, observing the number of relation-level locks and plan type counters:
prepare test (timestamptz) as
select event_id, event_data from events where event_time = $1;
-- Run this snippet to test each execution
begin;
explain (verbose) execute test(timestamptz '2024-06-06 00:00:00+00');
select
count(*) as lock_count,
array_agg(
distinct relation::reg
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.