Latest Blog Posts

Checkpoints, Write Storms, and You
Posted by Shaun Thomas in pgEdge on 2026-04-10 at 06:06

Every database has to reconcile two uncomfortable truths: memory is fast but volatile, and disk is slow but durable. Postgres handles this tension through its Write-Ahead Log (WAL), which records every change before it happens. But the WAL can't grow forever. At some point, Postgres needs to flush all those accumulated dirty pages to disk and declare a clean starting point. That process is called a checkpoint, and when it goes wrong, it can bring throughput to its knees.

A Bit About Checkpoints

Under normal operation, Postgres is remarkably polite about checkpoints. The  parameter (default 5 minutes) tells Postgres how often to perform a scheduled checkpoint, and  (default 0.9) tells it to spread the resulting writes over 90% of that interval. So a checkpoint timeout of 5 minutes means Postgres trickles dirty pages to disk over roughly 4.5 minutes, keeping IO impact to a minimum.This only applies to timed checkpoint behavior.The  parameter sets a soft limit on how much WAL can accumulate between checkpoints. When the WAL approaches that threshold (1GB by default), Postgres doesn't wait for the next scheduled checkpoint. Instead, it forces one immediately.These forced (or requested) checkpoints do not honor . Postgres needs to reclaim WAL space, so it flushes every dirty buffer to disk as fast as the IO subsystem will allow. On a busy system with a large  pool full of modified pages, this can completely saturate disk IO in seconds.It's like trying to drink from a firehose.

Rubber Meets the Road

To see this in action, we set up a modest test environment:
  • Hypervisor:
  •  
  • Proxmox
  • CPU:
  •  4x AMD EPYC 9454 cores
  • RAM:
  •  4GB
  • DB Storage:
  •  100GB @ 2,000 IOPS
  • WAL Storage:
  •  100GB @ 2,000 IOPS
  • OS:
  •  Debian 12 Bookworm
We initialized the database with  at a scale factor of 800, producing roughly 12GB of data (3x available RAM to reduce cache hits). We also followed the traditional advice of setting  to 25% of RAM, or 1GB in this case. All other set[...]

Waiting for PostgreSQL 19 – new pg_get_*_ddl() functions
Posted by Hubert 'depesz' Lubaczewski on 2026-04-09 at 16:37
On 5th of April 2026, Andrew Dunstan committed patch: Add pg_get_database_ddl() function   Add a new SQL-callable function that returns the DDL statements needed to recreate a database. It takes a regdatabase argument and an optional VARIADIC text argument for options that are specified as alternating name/value pairs. The following options are supported: pretty (boolean) … Continue reading "Waiting for PostgreSQL 19 – new pg_get_*_ddl() functions"

The 1 GB Limit That Breaks pg_prewarm at Scale
Posted by warda bibi in Stormatics on 2026-04-09 at 06:56

Recently, we encountered a production incident where PostgreSQL 16.8 became unstable, preventing the application from establishing database connections. The same behavior was independently reproduced in a separate test environment, ruling out infrastructure and configuration issues. Further investigation identified the pg_prewarm extension as the source of the problem.

This blog post breaks down the failure, the underlying constraint, why it manifests only under specific configurations, and the corresponding short-term mitigation and long-term fix.

What pg_prewarm Does

Every time PostgreSQL restarts, its shared buffer cache (the region of RAM it uses to hold frequently accessed data pages from disk) starts completely cold. Every query that touches data must go to disk first. On large production systems, this cold-start penalty can be severe, sometimes taking hours before the cache naturally warms through organic traffic.

pg_prewarm solves this in two ways. First, it gives you manual control so you can explicitly warm specific tables or indexes on demand, useful before a heavy batch job or a known query workload. Second, it ships with an autoprewarm mode that, when enabled, continuously tracks which pages are resident in shared buffers and automatically replays that list after a restart with no manual intervention required. For high-traffic systems with large shared_buffers, this is operationally critical.

The BUG

In order for autoprewarm to dump the list of cached pages, PostgreSQL must build an array in memory containing one entry per page currently in shared buffers.  Each entry (a BlockInfoRecord) is 20 bytes.

palloc(NBuffers * sizeof(BlockInfoRecord))

NBuffers is the number of shared buffer slots derived directly from shared_buffers setting. PostgreSQL’s standard memory allocator, palloc, enforces a hard ceiling of 1 GB on any single allocation.  Any palloc() call requesting more

[...]

pgcollection 2.0: Integer Keys, Range Deletes, and Oracle Parity
Posted by Jim Mlodgenski on 2026-04-09 at 00:01

In my first post about pgcollection, I introduced the collection type to address the challenge of migrating Oracle associative arrays keyed by strings to PostgreSQL. For integer-keyed associative arrays, I noted that native PostgreSQL arrays work well enough for simple cases. That holds true until the keys are sparse.

Consider this Oracle pattern:

DECLARE
  TYPE cache_t IS TABLE OF VARCHAR2(100) INDEX BY PLS_INTEGER;
  cache  cache_t;
BEGIN
  cache(1)       := 'first';
  cache(1000000) := 'millionth';
  DBMS_OUTPUT.PUT_LINE('Count: ' || cache.COUNT);  -- 2
END;

The equivalent attempt with a PostgreSQL array produces a different result:

DO $$
DECLARE
  a text[];
BEGIN
  a[1]       := 'first';
  a[1000000] := 'millionth';
  RAISE NOTICE 'Length: %', array_length(a, 1);  -- 1000000
END $$;

PostgreSQL fills positions 2 through 999,999 with NULLs. You asked for two entries and got a million-element array. Worse, it is impossible to distinguish between a key that was explicitly set to NULL and one that was never set at all. pgcollection now avoids both problems.

icollection

icollection is a 64-bit integer-keyed associative array that stores only the keys you set. The same pattern from above works as expected:

DO $$
DECLARE
  cache  icollection('text');
BEGIN
  cache[1]       := 'first';
  cache[1000000] := 'millionth';

  RAISE NOTICE 'Count: %', count(cache);  -- 2
  RAISE NOTICE 'Value: %', cache[1000000];
  RAISE NOTICE 'Key 500 exists: %', exist(cache, 500);  -- false
END $$;

icollection supports the same full set of operations as collection — subscript access, forward and reverse iteration, sorting, set-returning functions, and JSON casting — with bigint keys instead of text. It maps directly to Oracle’s TABLE OF ... INDEX BY PLS_INTEGER, with the keys widened to 64-bit so overflow is not a concern during migration.

The exist() function resolves the NULL ambiguity problem directly. With a PostgreSQL array, a[2] returns NULL whether the key was set to NULL or never set. With icollecti

[...]

AI at the Edge, Truth in Postgres
Posted by Vibhor Kumar on 2026-04-08 at 19:54
AI at the Edge, Truth in Postgres


A practical blueprint for secure, private, high-performance AI systems

Edge AI is having its inevitable moment. Not because the cloud is going away, but because reality keeps interrupting theory. Networks drop. Latency matters. Privacy rules get sharper teeth. Regulators ask harder questions. And in that world, the winning architecture is rarely the one with the flashiest model. It is the one that can still make the right decision when the link is weak, the clock is drifting, and the audit trail needs to hold up in daylight. As of April 2026, PostgreSQL 18 is the current major release, with 18.3 already out, and the surrounding governance landscape has moved too: the EU AI Act is now in phased application, and its broader 2026 obligations are close enough that “we’ll add governance later” is no longer a serious sentence. 

The core argument of this series still holds, and I would state it even more strongly now: at the edge, AI can be probabilistic, but your system of record cannot be. That is why PostgreSQL matters here. It is not just a database in this pattern. It is the local ledger, the policy boundary, the coordination plane, and often the simplest place to make trust real. PostgreSQL 18 strengthened that story with asynchronous I/O, OAuth authentication, continued row-level security capabilities, and ongoing logical replication improvements; meanwhile pgvector continues to make hybrid relational-plus-vector patterns more natural inside the same operational envelope. 

Edge is not a location. It is a latency budget and a failure budget.

A lot of edge architecture still gets described as geography. Factory floor. Retail store. Branch office. Vehicle. Hospital wing. That is useful, but incomplete. Edge is really the place where your acceptable latency, privacy boundary, and resilience needs collide. If a decision must happen in tens of milliseconds, if the raw data should not leave the site, or if the system must keep working through intermittent connectivity, then the architecture ha

[...]

Contributions for week 13, 2026
Posted by Cornelia Biacsics in postgres-contrib.org on 2026-04-07 at 08:58

The Prague PostgreSQL Meetup met on March 30, 2026, organized by Gulcin Yildirim Jelinek and Mayur B.

Speakers:

  • Radim Marek
  • Mayur B.

Community Blog Posts:

Community Videos:

Using the pgEdge MCP Server with a Distributed PostgreSQL Cluster
Posted by Ahsan Hadi in pgEdge on 2026-04-07 at 06:36

I recently wrapped up my blog series covering the exciting new features in PostgreSQL 18 — from Asynchronous I/O and Skip Scan to the powerful RETURNING clause enhancements. If you haven't had a chance to read them yet, head over to pgedge.com/blog where you'll also find some great content from my colleagues on how PostgreSQL is embracing the AI revolution.Speaking of the AI revolution — in this blog I want to shift gears and dive into something I've been genuinely excited to explore: using the pgEdge MCP Server with a distributed PostgreSQL cluster. I'll explore one of those AI tools firsthand — the pgEdge MCP Server — and specifically what it looks like to connect it to a true distributed PostgreSQL cluster.The Model Context Protocol (MCP) has quickly become the standard way to connect Large Language Models (LLMs) to external data sources and tools. With the release of the pgEdge Agentic AI Toolkit, PostgreSQL developers and DBAs can now connect AI assistants like Claude directly to their databases through the pgEdge Postgres MCP Server.In this blog, I'll focus specifically on what makes using the MCP Server (used with a pgEdge Distributed PostgreSQL cluster) interesting and different from a single-node setup. I'll walk through the setup, and demonstrate practical examples where the MCP Server combined with a distributed cluster becomes a powerful tool for DBAs and developers alike.

A Quick Overview of the pgEdge MCP Server

The pgEdge Postgres MCP Server is part of the pgEdge Agentic AI Toolkit. It gives AI assistants secure, structured access to your PostgreSQL database - not just raw query execution, but deep schema introspection, performance metrics, and the ability to reason about your data model. Once connected, Claude (or other LLMs) can understand your schema, identify slow queries, inspect index usage, and help you write optimized SQL - all through natural language.The following functionality sets the pgEdge MCP Server apart from other Postgres MCP servers:
  • Full schema introspection
  •  — pr
[...]

Schemas in PostgreSQL and Oracle: what is the difference?
Posted by Laurenz Albe in Cybertec on 2026-04-07 at 05:28

Two engineers talking. One says, "... then you log into the database as schema SYSTEM ..." and the other one thinks, "We shouldn't have tried to build that tower in Babylon."
© Laurenz Albe 2026

Recently, somebody asked me for a reference to a blog or other resource that describes how schemas work differently in Oracle. I didn't have such a reference, so I'm writing this article. But rather than just describing Oracle schemas to a reader who knows PostgreSQL, I'll try to present the topic in a way that helps Oracle users understand schemas in PostgreSQL as well. Since I already wrote about the difference between database transactions in Oracle and PostgreSQL, perhaps this can turn into a series!

The common ground: what is a schema

Schemas are defined by the SQL standard, so it is no surprise that there are a lot of similarities. Essentially, a schema is the same thing in Oracle and PostgreSQL: A named collection of database objects. Database objects that reside in a schema are called schema objects. Schemas have nothing to do with object storage, they give the database logical structure, very similar to directories in the file system. A schema is also a namespace: there cannot be any two tables with the same name in the same schema.

In the remaining article, I will explore the differences:

  • the relationship between users and schemas
  • the scope of a schema's namespace
  • schemas and privileges (permissions)
  • schemas and object ownership
  • the default schema for unqualified object names
  • system schemas

Users and schemas

This is probably the topic where the differences between both database management systems are most pronounced.

Oracle

Oracle has both database users and schemas, but it enforces a one-to-one correspondence between the two: Creating a user automatically creates a schema with the same name, and there is no other way to create a schema (the standard SQL statement CREATE SCHEMA exists, but doesn't create a schema). Because of that correspondence, many people don't clearly distinguish between “user” and “schema”. It is not unusual to hear an Oracle administrator say “Connect to the database as schema X” or “the tabl

[...]

pg_column_size(): What you see is not what you get
Posted by Lætitia AVROT on 2026-04-07 at 00:00
Thanks to my colleague Ozair, who sent me a JIRA ticket saying “I need to drop that huge column, what are the consequences?” My first question was: how huge? That’s when the rabbit hole opened. It looks simple. It is simple. Just use the administrative function pg_column_size(). Until you have toasted attributes. Then it gets interesting. A bit of history 🔗pg_column_size() was added in PostgreSQL 8.1 by Mark Kirkwood (commit a9236028).

pg_clickhouse 0.1.10
Posted by David Wheeler on 2026-04-06 at 21:38

Hi, it’s me, back again with another update to pg_clickhouse, the query interface for ClickHouse from Postgres. This release, v0.1.10, maintains binary compatibility with earlier versions but ships a number of significant improvements that increase compatibility of Postgres features with ClickHouse. Highlights include:

  • Mappings for the JSON and JSONB -> TEXT and ->> TEXT operators, as well as jsonb_extract_path_text() and jsonb_extract_path(), to be pushed down to ClickHouse using its sub-column syntax.
  • Mappings to push down the Postgres statement_timestamp(), transaction_timestamp(), and clock_timestamp() functions, as well as the Postgres “SQL Value Functions”, including CURRENT_TIMESTAMP, CURRENT_USER, and CURRENT_DATABASE.
  • And the big one: mappings to push down compatible window functions, including ROW_NUMBER, RANK, DENSE_RANK, LEAD,LAG, FIRST_VALUE, LAST_VALUE, NTH_VALUE, NTILE, CUME_DIST, PERCENT_RANK, and MIN/MAX OVER.
  • Oh yeah, the other big one: added result set streaming to the HTTP driver. Rather that load all the results A testing loading a 1GB table reduced memory consumption from over 1GB to 73MB peak.

We’ll work up a longer post to show off some of these features in the next week. But in the meantime, git it while it’s hot!

Thanks to my colleagues, Kaushik Iska and Philip Dubé for the slew of pull requests I waded through this past week!

Don't let your AI touch production
Posted by Radim Marek on 2026-04-06 at 20:24

Not so long ago, the biggest threat to production databases was the developer who claimed it worked on their machine. If you've attended my sessions, you know this is a topic I'm particularly sensitive to.

These days, AI agents are writing your SQL. The models are getting incredibly good at producing plausible code. It looks right, it feels right, and often it passes a cursory glance. But "plausible" isn't a performance metric, and it doesn't care about your execution plan or locking strategy.

AI-generated SQL is syntactically correct, which is the easy part. The hard part is knowing what a statement does to a running system: which locks it takes, how long it holds them, whether it rewrites the table on disk.

The spectacular failures get the headlines. In July 2025, an AI coding agent wiped a production database during a code freeze -- ran destructive commands, panicked, then lied about what it had done.

But the biggest damage is quieter. It's the migration that passes every test, ships through CI and then locks a production table during peak traffic. The query written with random assumptions. Indexes added based on copy/paste from psql. The cumulative effect that builds over time, when nobody is looking.

How to give AI agent "eyes"

When you're running Claude Code or any other agentic coding tool, the bottleneck isn't the model's intelligence. It's the fidelity of the environment. Standard AI coding involves the agent guessing column names based on your description, or in better cases parsing a schema.sql file or using a local database with seed data.

None of these give the agent the one thing it actually needs: awareness of your production schema. The table sizes, the indexes, the constraints, the statistics that determine whether a query index-scans in 2ms or sequential-scans for 40 seconds.

The obvious fix is to give it a database connection. Let it query pg_catalog, read table sizes, check existing indexes. This is how Anthropic's reference PostgreSQL MCP server worked, and

[...]

WAL as a Data Distribution Layer
Posted by Richard Yen on 2026-04-06 at 08:00

Introduction

Every so often, I talk to someone working in data analytics who wants access to production data, or at least a snapshot of it. Sometimes, they tell me about their ETL setup, which takes hours to refresh and can be brittle, with a lot of monitoring around it. For them, it works, but it sometimes gets me wondering if they need all that plumbing to get a snapshot of their live dataset. Back at Turnitin, I set up a way to get people access to production data without having to snapshot nightly, and I thought maybe I should share it with people here.

Common Implementations and Their Risks

Typical solutions that we might encounter as we give people a little bit of access to production data:

1. Query the primary

This is generally a bad idea, since you don’t want users getting access to the production prirmary, lest they make some mistakes or do something to lock up tables that prevent customers from using your apps. Even with a read-only user, large data analytics queries could cause unwanted interference that negatively affect your uptime. This is almost certainly not the way to go.

2. Query a streaming replica

This is better, but doing this is not free. Long-running queries can create replay lag, vacuum conflicts can cancel queries, and I/O contention can affect the primary upstream. It’s safer since users are forced to be read-only, but that still carries risk.

3. Nightly snapshots / rebuilds

Having time-based snapshots and rebuilds are the most common form of getting data out to analysts. ETL queries run at night (or some other specified regular interval) and provide the information needed to do the necessary work. This works, but is another piece of software that produces somewhat stale data, depending on how much stale-ness can be tolerated.

Once Upon a Time, Before Streaming Replication

If you’ve spent any time in Postgres, you already understand streaming replication. Primary sends WAL to standby, and standby replays the WAL stream. All the tutorials tal

[...]

PAX: The Storage Engine Strikes Back
Posted by Lætitia AVROT on 2026-04-06 at 00:00
Thanks to Boris Novikov, who pointed me in the PAX direction in the first place and followed up with many insightful technical discussions. I’m grateful for all his time and the great conversations we’ve had and continue to have. To dive deeper into the mechanics of PAX, I highly recommend checking out my previous post: PAX: The cache performance you’re looking for. PAX looks elegant on paper: minipages, cache locality, column-oriented access inside an 8KB page.

Using non ACID storage as workaround instead missing autonomous transactions
Posted by Pavel Stehule on 2026-04-03 at 05:57

When I was younger, the culture war (in my bubble) was about transactional versus non-transactional engines, Postgres versus MySQL (MyISAM). Surely, I preferred the transactional concept. Data integrity and crash safety is super important.  But it is not without costs. It was visible 30 years ago, when MySQL was a super fast and PostgreSQL super slow database. Today on more powerful computers it is visible too, not too strong, but still it is visible. And we still use non-transactional storages a lot of - applications logs. 

 There are some cases when performance wins over consistency, and it can be acceptable. When I thought about non-transactional storages, I got one idea. It can be great replacement for missing autonomous transactions. But how to test it. Fortunately I found a csv_tam storage implemented by Alexey Gordeev.  This storage is mostly a concept with a lot of limits. But the idea is great - csv is a strong protocol - it is not block based, it has no row headers - so it can be very hard to support transactions. On second hand, it is primitive, and without any buffering and with forcing syncing after any row, it is mostly crash safe (against Postgres crash). Sure - it is not as safe as block storage ensured by WAL, but can be safe enough - billions applications use this safety for logging today. 

I did fork and fixed build on pg 17+. Now all types are supported and writing from parallel writes should be safe. It doesn't write to WAL, so these tables cannot be backuped and cannot be replicated - what can be a nice game to support it. It is not easy to do that in a non-block format. But for testing it is enough, and I believe so this extension is very simple, so it is enough for non critical environments. It is really very very simple. 

Postgres has not autonomous transactions. There are some workarounds like using dblink or pg_background.  As usual any workaround has some disadvantages and limits. pg_background looks good, but at the end, it doesn't ensure 100% suc

[...]

What is a Collation, and Why is My Data Corrupt
Posted by Shaun Thomas in pgEdge on 2026-04-03 at 05:36

The GNU C Library (glibc) version 2.28 entered the world on August 1st, 2018 and Postgres hasn't been the same since. Among its many changes was a massive update to locale collation data, bringing it in line with the 2016 Edition 4 release of the ISO 14651 standard and Unicode 9.0.0. This was not a subtle tweak. It was the culmination of roughly 18 years of accumulated locale modifications, all merged in a single release.Nobody threw a party.What followed was one of the most significant and insidious data integrity incidents in the history of Postgres. Indexes silently became corrupt, query results changed without warning, and unique constraints were no longer trustworthy. The worst part? You had to know to look for it. Postgres didn't complain. The operating system didn't complain. Everything appeared normal, right up until it wasn't.This is the story of how a library upgrade quietly corrupted databases around the world, what the Postgres community did about it, and how to make sure it never happens to you again.

What even is a Collation?

Before we can understand what broke, we need to understand what a collation actually does. At its core, a collation defines how text is compared and sorted. That sounds simple enough, but collation rules become much more turbulent outside of the English alphabet.Consider the German letter ß. Does it sort the same as "ss"? Usually. What about accented characters like é and è? Should they be treated as equivalent to "e" for sorting purposes, or should they have their own distinct positions? What about the Swedish alphabet, where ä and ö come after z rather than being treated as variants of a and o?Every language has its own answer to these questions, and a collation encodes those answers into a set of rules for a database to follow. When Postgres needs to sort a column of text, enforce a unique constraint, or build a B-tree index, it asks the collation: "Which of these two strings comes first?" The collation's answer determines everything from query results to whether an i[...]

pg_clickhouse 0.1.6
Posted by David Wheeler on 2026-04-02 at 15:21

We fixed a few bugs this week in pg_clickhouse, the query interface for ClickHouse from Postgres The fixes, improve query cancellation and function & operator pushdown, including to_timestamp(float8), ILIKE, LIKE, and regex operators. Get the new v0.1.6 release from the usual places:

Thanks to my colleague, Kaushik Iska, for most of these fixes!

pgEdge MCP Server for Postgres Is Now GA. Here’s Why That Matters
Posted by Antony Pegg in pgEdge on 2026-04-02 at 12:08

If you’re building agentic AI applications, you’ve probably already hit the wall where your LLM needs to actually talk to a database. Not just dump a schema and hope for the best, but genuinely understand the data model, write reasonable queries, generate code for new UIs and even entire applications, and do it all without you holding its hand through every interaction. That’s the problem MCP servers are supposed to solve, and most of them do a decent enough job of it when you’re prototyping on your laptop.Production is a different story.The pgEdge MCP Server for Postgres is now generally available, and it’s also available as a managed service inside pgEdge Cloud. We built it because the gap between “works in a demo” and “runs in production with real security, real availability requirements, and real compliance constraints” is wider than most people realize, and the existing MCP servers out there weren’t closing it.

The Problem With Most MCP Servers

Here’s what typically happens. You grab an MCP server, wire it up to Claude Code or Cursor, point it at a local Postgres instance, and everything works great. Your LLM can introspect the schema, write queries, generate application code and UIs, and even suggest optimizations. You feel like you’re living in the future.Then someone asks you to run it against the production database. The one with PII in it. The one that needs to stay in eu-west-1 or on-prem for compliance reasons. The one that can’t go down because three other services depend on it. And suddenly you’re staring at a tool that doesn’t support TLS, doesn’t have real authentication, can’t enforce read-only access, and definitely wasn’t designed to run in an air-gapped environment.So we built one that closes it. The pgEdge MCP Server works with any standard Postgres database running v14 or newer (not just pgEdge’s own products), and it’s designed from the ground up for the kind of environments where “just spin it up and see what happens” isn’t an acceptable deployment strategy.

What Ships in pgEdge

[...]

Waiting for PostgreSQL 19 – Add UPDATE/DELETE FOR PORTION OF
Posted by Hubert 'depesz' Lubaczewski on 2026-04-02 at 10:51
On 1st of April 2026, Peter Eisentraut committed patch: Add UPDATE/DELETE FOR PORTION OF   This is an extension of the UPDATE and DELETE commands to do a "temporal update/delete" based on a range or multirange column. The user can say UPDATE t FOR PORTION OF valid_at FROM '2001-01-01' TO '2002-01-01' SET ... (or likewise … Continue reading "Waiting for PostgreSQL 19 – Add UPDATE/DELETE FOR PORTION OF"

pg_background v1.9: a calmer, more practical way to run SQL in the background
Posted by Vibhor Kumar on 2026-04-01 at 23:30

There is a kind of database pain that does not arrive dramatically. It arrives quietly.

A query runs longer than expected. A session stays occupied. Someone opens another connection just to keep moving. Then another task shows up behind it. Soon, a perfectly normal day starts to feel like too many people trying to get through one narrow doorway.

That is where pg_background becomes useful.

It lets PostgreSQL run SQL in background workers while the original session stays free. The work still happens inside the database, close to the data, but the caller no longer has to sit there waiting for every long-running step to finish. At its heart, pg_background is about giving PostgreSQL a cleaner way to handle asynchronous work without forcing teams to leave the database just to keep a session responsive.

TL;DR

pg_background lets PostgreSQL execute SQL asynchronously in background worker processes so the calling session does not stay blocked. It supports result retrieval through shared memory queues, autonomous transactions that commit independently of the caller, explicit lifecycle control such as launch, wait, cancel, detach, and list operations, and a hardened security model with a NOLOGIN role, privilege helpers, and no PUBLIC access.

Version 1.9 adds worker labels, structured error returns, result metadata, batch operations, and compatibility across PostgreSQL 14, 15, 16, 17, and 18.

What changes in day-to-day life with v1.9

Version 1.9 improves the operator experience in ways that matter during real work.

Worker labels reduce guesswork when several tasks are running at once. Structured error returns make it easier for scripts and applications to react intelligently when background work fails. Result metadata makes it possible to inspect completion state without consuming the result stream. Batch operations simplify cleanup when a session launches several workers.

Taken together, these additions make pg_background easier to live with. That is the real value of this rele

[...]

Replicating CrystalDBA With pgEdge MCP Server Custom Tools
Posted by Antony Pegg in pgEdge on 2026-04-01 at 07:44

A disclaimer before we start: I'm product management, no longer an engineer. I can read code, I can write it … incredibly slowly. I understand PostgreSQL at a product level, and I know what questions to ask. But the code in this project was written by Claude - specifically, Claude Code running in my terminal as a coding agent. I directed the architecture, made the design calls, reviewed the output, and did the testing. Claude wrote the code. This is a vibe-coding story as much as it is a technical one.The pgEdge Postgres MCP Server has a custom tool system. You write PL/pgSQL in a YAML file, drop it into the config, and the server exposes it as an MCP tool. I wanted to find out how far that system could go - not with toy examples, but with something genuinely hard.

The pgEdge Postgres MCP Server

The pgEdge Postgres MCP Server connects AI agents and tools - Claude Code, Claude Desktop, Cursor, and others - directly to any PostgreSQL database. It supports Postgres 14 and newer, including standard community Postgres, Amazon RDS, and managed services. It's open source under the PostgreSQL licence.The server handles multi-database connections, user and token authentication, TLS, read-only enforcement, and optional write access. It runs over both  and HTTP transports. You can read more about it on the pgEdge AI Toolkit page or in the documentation.The feature that matters for this post is custom tools. You define MCP tools in a YAML file - SQL queries, PL/pgSQL code blocks, or stored function calls - and the server exposes them alongside its built-in tools. No recompilation, no plugins. Drop in the YAML, point the config at it, restart.

CrystalDBA: The Benchmark

The Crystal DBA team built one of the most popular Postgres MCP servers on GitHub - over 2,000 stars, and deservedly so. Their  server packs genuine DBA intelligence into a clean MCP interface: health checks, top query analysis, and an index tuner that uses real cost-based simulation. It's impressive work.One thing worth noting: the project's developm[...]

Postgres Vacuum Explained: Autovacuum, Bloat and Tuning
Posted by Elizabeth Garrett Christensen in Snowflake on 2026-03-31 at 10:24

If you’ve been using Postgres for a while, you’ve probably heard someone mention "vacuuming" the database or use the term “bloat.” These both sound choresome and annoying — but they’re just part of life in a healthy database. In modern Postgres versions, autovacuum usually handles these issues for you behind the scenes. But as your database footprint grows, you might start wondering: Is the default setting enough? Do I need to vacuum Postgres manually? or Why is my database suddenly taking up way more disk space than it should?

Let’s dive into why we vacuum, how autovacuum works and when you actually need to step in and tune it.   

Why does Postgres need to be vacuumed?

Postgres uses multiversion concurrency control (MVCC) to handle simultaneous transactions. When you update or delete a row, Postgres doesn't actually erase it from the disk immediately. Instead, it marks that row as "deleted" using a hidden transaction ID field, resulting in a “dead tuple.”  

These deleted rows waiting for cleanup are called bloat. The Postgres vacuum process comes along every so often to find these dead tuples and makes their space available for reuse. If you don't vacuum, your tables and indexes just keep growing, consuming disk space and degrading performance, even if your actual data volume stays the same. 

Beyond just space, vacuuming handles another critical task: preventing transaction ID (XID) wraparound. Postgres uses a 32-bit counter for transactions. If you run through 2 billion transactions without vacuuming to "freeze" old rows, you could be facing a wraparound that will shut Postgres down. Vacuuming marks old tuples as "frozen" so their IDs can be safely reused.   

How does autovacuum work?

Postgres has had autovacuum since 2005/8.1, and it has been turned on by default since 2008/8.3. You can turn it off table by table, though in general this is not recommended.

It uses a specific formula to decide when a table is "dirty" enough to need a cleanup. By default, it triggers a vacuum w

[...]

Why Ora2Pg Should Be Your First Stop for PostgreSQL Conversion
Posted by Deepak Mahto on 2026-03-31 at 08:16

I have been doing Oracle-to-PostgreSQL migrations for over last decades across enterprises, cloud platforms, and everything in between. I have used commercial tools, cloud-native services, and custom scripts. And when it comes to table DDL conversion, I keep coming back to the same tool: Ora2pg. Not because it is the flashiest or the easiest to set up, but because once you understand its configuration model, nothing else comes close to the control it gives you. This post is my attempt to convince you of the same and to walk you through the specific features that I use on every single engagement.

Let me be direct if you are doing an Oracle-to-PostgreSQL migration and you have not tried Ora2pg for your DDL conversion, you are making your life harder than it needs to be. This is a 20+ year battle-tested open-source project that has seen more Oracle schemas than most of us ever will.

This post is specifically about Table DDL conversion and the transformation knobs Ora2pg gives you. Not PL/SQL. Not data export. Just the DDL side because that alone deserves a dedicated conversation.

What Ora2Pg Actually Is (Quick Primer)

Ora2Pg connects to your Oracle database, scans it automatically, extracts its structure or data or code, and generates PostgreSQL-compatible SQL scripts. One config file ora2pg.conf controls everything. Set your DSN, set your schema, set the export type, and you are off.


  
ora2pg -t TABLE -c ora2pg.conf -o tables.sql

That one command gets you tables with constraints – primary keys, foreign keys, unique constraints, check constraints all translated and ready to load. It handles sequences, indexes, partitions, grants, views, the works. But let us focus on what makes it genuinely powerful for DDL work: the transformation section.

The Transformation Section: Where the Real Power Lives

The ora2pg.conf transformation directives are what separate Ora2pg from a simple schema dumper. You are not just getting a mechanical type mapping, you are getting a config

[...]

Patroni: Cascading Replication with Stanby Cluster
Posted by Umut TEKIN in Cybertec on 2026-03-31 at 05:42
Patroni: Cascading Replication with Stanby Cluster

Patroni is a widely used solution for managing PostgreSQL high availability. It provides a robust framework for automatic failover, cluster management, and operational simplicity in PostgreSQL environments. Patroni offers many powerful features that make PostgreSQL clusters easier to manage while maintaining reliability and operational flexibility.

In this article, we will focus on one of these powerful features: Patroni standby cluster. Recently, we have received many questions about this feature, so we decided to prepare a practical guide explaining how it works and how it can be configured.

Fundamentally, a Patroni standby cluster implements cascading replication. Within the standby cluster, there is a node that acts as a standby leader. Although it behaves as the leader inside the standby cluster, it is actually a replica of the primary Patroni cluster. The remaining nodes in the standby cluster replicate from this standby leader.

Why do we use Patroni Standby Cluster?

There are several reasons why we deploy a Patroni standby cluster. The most common use cases can be grouped into four main categories;

  • Geographic Redundancy/Disaster Recovery: If the entire primary cluster becomes unavailable due to a data center failure, infrastructure outage, or other catastrophic event, the standby cluster can be promoted to become the new primary cluster.
  • Controlled Migrations: A standby cluster allows teams to migrate an existing PostgreSQL cluster to new hardware, a new data center, or another cloud provider with minimal downtime.
  • Isolated Load Balancing/Read Scaling: Applications running in another region can perform read operations on the standby cluster, reducing network latency to the primary cluster.
  • Testing: The standby cluster can be used for disaster recovery testing, failover simulations, and backup validation without impacting the primary production cluster.

Prerequisites

Before setting up a standby cluster, several requirements must be sa

[...]

pg_service.conf: the spell your team forgot to learn
Posted by Lætitia AVROT on 2026-03-31 at 00:00
I’ll be honest with you. I’m old school. My IDE is vim. My PostgreSQL client is psql. I’ve been using it for almost 20 years and I still think it’s the best PostgreSQL client out there. I might be biased. When I joined a new team recently, I noticed something: nobody was using psql. Everyone was connecting through their IDE. Which, fair enough, IDEs have saved connection profiles. You click, you’re in.

The Hidden Behavior of plan_cache_mode
Posted by Richard Yen on 2026-03-30 at 08:00

Introduction

Most PostgreSQL users use prepared statements as a way to boost performance and prevent SQL injection. Fewer people know that the query planner silently changes the execution plan for prepared statements after exactly five executions.

This behavior often surprises engineers because a query plan can suddenly shift—sometimes dramatically, even though the query itself hasn’t changed. The reason lies in the planner’s handling of custom plans vs generic plans, controlled by the parameter plan_cache_mode.


Custom Plans vs Generic Plans

When a prepared statement is executed with parameters, the planner has two choices:

  1. Custom Plan: Generated using the actual parameter values. It is potentially optimal for that specific execution but requires planning overhead every time.
  2. Generic Plan: Planned once without knowing specific parameter values. It is reused for all subsequent executions to save planning overhead.

By default, plan_cache_mode is set to auto. In this mode, the planner uses custom plans for the first five executions. On the sixth execution, it compares the average cost of those custom plans against the estimated cost of a generic plan. If the generic plan is deemed “cheaper” or equal, the planner switches to it permanently for that session.


Demonstrating with pgbench

As always, pgbench is the schema of choice when it comes to simple demonstrations. I’m using Postgres 18, which is the latest version as of this writing. Adding a column with highly skewed values makes it easier to trigger the switch, for the purposes of this post. Therefore we add a flag column with extreme skew: 'N' for 0.1% of rows, 'Y' for the remaining 99.9%:

### In bash:
pgbench -i -s 10 -U postgres postgres

### In psql:
ALTER TABLE pgbench_accounts ADD COLUMN flag CHAR(1) NOT NULL DEFAULT 'Y';
UPDATE pgbench_accounts SET flag = 'N' WHERE aid <= 1000;
CREATE INDEX idx_accounts_flag ON pgbench_accounts(flag);
ANALYZE pgbench_accounts;

SELECT flag, count(*) FROM pgbench_acc
[...]

Contributions for week 12, 2026
Posted by Cornelia Biacsics in postgres-contrib.org on 2026-03-30 at 07:31

From March 23 to March 26, the following contributions were made to PostgreSQL at SREcon26 (Americas):

PostgreSQL booth volunteers:

  • Aya Griswold
  • Erika Miller
  • Gabrielle Roth
  • Jennifer Scheuerell
  • Umair Shahid
  • Alex Wood

PostgreSQL speakers:

  • Clint Byrum
  • Ben Dicken
  • Umair Shahid

On March 24, 2026 Nordic PGDay 2026 took place in Helsinki, Finland.

Organizers:

  • Daniel Gustafsson
  • Magnus Hagander
  • Georgios Kokolatos

Call for Paper Committee:

  • Georgios Kokolatos
  • Louise Leinweber
  • Liisa Hämäläinen
  • Thea Stark

Speakers:

  • Pavlo Golub
  • Mats Berglin
  • Miguel Toscano
  • Ants Aasma
  • Alexander Kukushkin
  • Henrietta Dombrovskaya
  • Gianni Ciolli
  • Chris Ellis
  • Josef Machytka
  • Teresa Lopes
  • Tomas Vondra
  • Vik Fearing
  • James McDonald
  • Nina Angelvik
  • Radim Marek
  • Pat Wright
  • Ellyne Phneah
  • Ryan Booz

Volunteers:

  • Boris Novikov
  • Jesper St John
  • Maija Linnakangas
  • Pavlo Golub
  • Thea Stark

PGDay Paris 2026 happened on Thursday March 26 in Paris, France, organized by:

  • Carole Arnaud
  • Hécate
  • Vik Fearing

Speakers:

  • Luigi Nardi
  • Marc Linster
  • Daniel Westermann
  • Derk van Veen
  • Magnus Hagander
  • Catherine Bouxin
  • Teresa Lopes
  • Damien Clochard
  • Matt Cornillon
  • Stéphane Schildknecht
  • Sébastien Lardière
  • Pavlo Golub
  • Ryan Booz
  • Mayuresh Suresh Bagayatkar

Call for Paper Committee:

  • Pavlo Golub
  • Sarah Conway
  • Valeria Kaplan

Code of Conduct Committee:

  • Erik de Ruiter
  • Floor Drees
  • Svitlana Lytvynenko

Community Blog Posts

Prairie Postgres Second Developers’ Summit and Why You Should Participate
Posted by Henrietta Dombrovskaya on 2026-03-29 at 21:38

In my current position as Database Architect at DRW, I talk with end users more than I ever did in my life. Our end users are application developers who look at PostgreSQL from a very utilitarian perspective. Trust me, they do not care whether Postgres is the most advanced DBMS or not. They are very pragmatic: they need a database that will help them accomplish their goals: write the data fast, store reliably, read anything in milliseconds, and run analytics.

None of the software engineering positions lists the knowledge of any relational database as a requirement. Or any database for that matter. It’s a “nice to have” at best. They come to me asking: is Postgres a good tool for what we need the database to do? And I always reply: Postgres is good for anything you need to do, but you need to know how to use it.

We develop new cool features and make Postgres more powerful; however, many of these features are barely used. Big consulting companies are very much aware of the problems their big clients have, but they are rarely aware of the struggles of thousands of small startups.  

If you are a software engineer who use Postgres, and if you are frustrated with bloated tables, “autovacuum blocking your processes,” partitioned tables being slow, indexes not used, CPU being close to 100% most times, or your system running out of memory, chances are there are some easy fixes that will make your life better and your satisfaction with Postgres might reach the next level:).

Or maybe you were able to successfully resolve your Postgres problems, and now you think it’s embarrassing that you didn’t figure them out earlier. Don’t be embarrassed, be proud that you figured things out! And trust me, that very moment, dozens of developers are struggling with the problem you successfully resolved.

In either case, it would be totally worth your time if you join us on April 14!  And if you think that Claude will solve all your problems, it’s only partially true. Talking with Claude will never substitute talking

[...]

Good CTE, bad CTE
Posted by Radim Marek on 2026-03-29 at 12:47

CTEs are often the first feature developers reach for beyond basic SQL, and often the only one.

But the popularity of CTEs usually has less to do with modernizing code and more to do with the promise of imperative logic. For many, CTE acts as an easy to understand remedy for 'scary queries' and way how to force execution order on the database. The way how many write queries is as if they tell optimizer "first do this, then do that".

This creates a problem. CTEs handle query decomposition, recursion and multi statement DDLs. Planner treats them differently depending how you write and use them though. For long time (prior PostgreSQL 12) CTEs acted as optimization fence. The planner couldn't push predicates into them, couldn't use indexes on the underlying tables. Couldn't do anything that materialize them and scan through the result.

PostgreSQL 12 changed this. CTEs now get inlined, materialized, or something in between, depending on how you write them.

Sample schema

We will use the same schema as in the article PostgreSQL Statistics: Why queries run slow.

CREATE TABLE customers (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE orders (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id integer NOT NULL REFERENCES customers(id),
    amount numeric(10,2) NOT NULL,
    status text NOT NULL DEFAULT 'pending',
    note text,
    created_at date NOT NULL DEFAULT CURRENT_DATE
);

CREATE TABLE orders_archive (LIKE orders INCLUDING ALL EXCLUDING IDENTITY);

INSERT INTO customers (name)
SELECT 'Customer ' || i
FROM generate_series(1, 2000) AS i;

INSERT INTO orders (customer_id, amount, status, note, created_at)
SELECT
    (random() * 1999 + 1)::int,
    (random() * 500 + 5)::numeric(10,2),
    (ARRAY['pending','shipped','delivered','cancelled'])[floor(random()*4+1)::int],
    CASE WHEN random() < 0.3 THEN 'Some note text here for padding' ELSE NULL END,
    '2022-01-01'::date + (random() * 1095)::int
FROM generate_series(1, 10000
[...]

Waiting for PostgreSQL 19 – json format for COPY TO
Posted by Hubert 'depesz' Lubaczewski on 2026-03-29 at 12:34
On 20th of March 2026, Andrew Dunstan committed patch: json format for COPY TO   This introduces the JSON format option for the COPY TO command, allowing users to export query results or table data directly as a stream of JSON objects (one per line, NDJSON style).   The JSON format is currently supported only … Continue reading "Waiting for PostgreSQL 19 – json format for COPY TO"

pg_duckpipe: What's New in March 2026
Posted by Yuwei Xiao on 2026-03-28 at 00:00
pg_duckpipe is a PostgreSQL extension for real-time CDC to DuckLake columnar tables. This month: transparent query routing, append-only changelog, fan-in streaming, partitioned table support, and more.

Top posters

Number of posts in the past two months

Top teams

Number of posts in the past two months

Feeds

Planet

  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.

Contact

Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.