Latest Blog Posts

Unpublished interview
Posted by Oleg Bartunov in Postgres Professional on 2025-12-23 at 12:54
I found a copy of an interview that apparently has not been published anywhere.



Interview with Oleg Bartunov




“Making Postgres available in multiple languages was not my goal—I was just working on my actual task.”


Oleg Bartunov has been involved in PostgreSQL development for over 20 years. He was the first person to introduce locale support, worked on non-atomic data types in this popular DBMS, and improved a number of index access methods. He is also known as a Himalayan traveler, astronomer, and photographer.







— Many of your colleagues know you as a member of the PostgreSQL Global Development Group. How did you get started with Postgres? When and why did you join the PostgreSQL community?


— My scientific interests led me to the PostgreSQL community in the early 1990s. I’m a professional astronomer, and astronomy is the science of data. At first, I wrote databases myself for my specific tasks. Then I went to the United States—in California, I learned that ready-made databases existed, including open-source ones you could just take and use for free! At the University of California, Berkeley, I encountered a system suited to my needs. At that time, it was still Ingres, not even Postgres95 yet. Back then, the future PostgreSQL DBMS didn’t have many users—the entire mailing list consisted of 400 people, including me. Those were happy times.





— Does your experience as a PostgreSQL developer help you contribute to astronomy?


— We’ve done a lot for astronomy. Astronomical objects are objects on a sphere, so they have spherical coordinates. We introduced several specialized data types for astronomers and implemented full support for them—on par with numbers and strings.





— How did scientific tasks transform into business ones?


— I started using PostgreSQL to solve my astronomical problems. Gradually, people from outside my field began reaching out to me. In 1994, it became necessary to create a digital archive for The Teacher’s Newspaper, an

[...]

Don't give Postgres too much memory (even on busy systems)
Posted by Tomas Vondra on 2025-12-23 at 10:00

A couple weeks ago I posted about how setting maintenance_work_mem too high may make things slower. Which can be surprising, as the intuition is that memory makes things faster. I got an e-mail about that post, asking if the conclusion would change on a busy system. That’s a really good question, so let’s look at it.

To paraphrase the message I got, it went something like this:

Lower maintenance_work_mem values may split the task into chunks that fit into the CPU cache. Which may end up being faster than with larger chunks.
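A quick way to test that claim on your own hardware is to time the same index build under two settings; a minimal sketch, assuming a scratch table t with a few million integer rows (names and sizes are illustrative, not from the post):

-- Build the same index with a small and a large maintenance_work_mem
-- and compare the psql \timing output.
SET maintenance_work_mem = '64MB';
\timing on
CREATE INDEX t_a_idx_small ON t (a);
DROP INDEX t_a_idx_small;

SET maintenance_work_mem = '4GB';
CREATE INDEX t_a_idx_large ON t (a);
\timing off

Comparing the two timings on your own storage and CPU is more telling than any general rule of thumb.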

PostgreSQL Column Limits
Posted by Umair Shahid in Stormatics on 2025-12-23 at 08:03

The 1,600‑column ceiling, and the real production problems that show up long before it

If you’ve ever had a deployment fail with “tables can have at most 1600 columns”, you already know this isn’t an academic limit. It shows up at the worst time: during a release, during a migration, or right when a customer escalation is already in flight.

But here’s the more common reality: most teams never hit 1,600 columns; they hit the consequences of wide tables first:

  • Query latency creeps up (more I/O per row, less cache efficiency)
  • WAL volume and replication lag increase (updates write new row versions)
  • Backups get bigger, restores get slower
  • Schema changes become scarier (locks + rewrites + “what else will break?”)

This post explains what PostgreSQL’s column limits really are, how to spot risk early, and what the limits imply.
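One cheap way to spot the risk early is to check column counts before they become a support ticket; a minimal sketch (the 20-row limit is arbitrary):

-- List the widest tables in the current database (visible columns only).
SELECT table_schema, table_name, count(*) AS column_count
FROM information_schema.columns
GROUP BY table_schema, table_name
ORDER BY column_count DESC
LIMIT 20;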

“We didn’t design a 1,600‑column table, but PostgreSQL says we did”

This problem tends to arrive in one of three ways:

1. The table got wide naturally

  • You add “just one more feature flag” every sprint
  • You support more integrations, more optional attributes, more “customer‑specific” fields
  • Your ORM makes it easy to keep shoving columns into a single “God” table

2. You dropped columns, so you think you’re safe
You look at \d my_table and see 400 columns. But PostgreSQL refuses to add column #401 because dropped columns still count toward the limit. That’s not folklore. It’s documented behavior. (Ref: https://www.postgresql.org/docs/current/limits.html)
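Because \d hides dropped columns, pg_attribute is where you can see how many attribute slots a table has really consumed; a sketch using a hypothetical my_table:

-- Dropped columns keep their attribute slots, and those slots still
-- count toward the 1,600-column limit.
SELECT count(*) FILTER (WHERE attisdropped)     AS dropped_columns,
       count(*) FILTER (WHERE NOT attisdropped) AS visible_columns,
       count(*)                                 AS slots_counted_toward_limit
FROM pg_attribute
WHERE attrelid = 'my_table'::regclass
  AND attnum > 0;

The only way to reclaim those slots is to recreate the table (for example with CREATE TABLE ... AS and a swap); dropping more columns does not free them.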

3. You didn’t hit the table limit, you hit the query limit
Even if no single table has 1,600 columns, a SELECT * across several wide tables (or a wide view) can hit the result set column limit (1,664) and fail in surprising way

[...]

PostgreSQL Santa’s Naughty Query List: How to Earn a Spot on the Nice Query List?
Posted by Mayur B. on 2025-12-23 at 07:01

Santa doesn’t judge your SQL by intent. Santa judges it by execution plans, logical I/O, CPU utilization, temp usage, and response time.

This is a practical conversion guide: common “naughty” query patterns and the simplest ways to turn each into a “nice list” version that is faster, more predictable, and less likely to ruin your on-call holidays.

1) Naughty: SELECT * on wide tables (~100 columns)

Why it’s naughty

  • Wider tuples cost more everywhere: more memory bandwidth, more cache misses, bigger sort/hash entries, more network payload.
  • Index-only scans become impossible: if you request 100 columns, the planner can’t satisfy the query from a narrow index, so you force heap fetches.
  • You pay a bigger “spill tax”: wide rows make sorts/aggregations spill to disk sooner.

Extra naughty in the HTAP era

Postgres is increasingly “one database, multiple workloads.” Columnar/analytics options (e.g., OrioleDB, Hydra Columnar, DuckDB-backed columnstore/engine integrations) make projection discipline even more decisive, because columnar execution reads only the referenced columns, so selecting fewer columns directly reduces I/O and CPU.

Nice list fixes

  • Always select only what you need.

Santa verdict: If you didn’t need the column, don’t fetch it, don’t carry it, don’t ship it.
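As a sketch of the fix (table, columns, and index are illustrative, not from the post), projecting only the needed columns also lets a covering index satisfy the query without touching the heap:

-- A covering index for the hot access path.
CREATE INDEX orders_customer_created_idx
    ON orders (customer_id, created_at) INCLUDE (status, total_amount);

-- Nice-list version: project only what the application actually uses;
-- this can often run as an index-only scan.
SELECT customer_id, created_at, status, total_amount
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;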

2) Naughty: WHERE tenant_id = $1 ... ORDER BY x LIMIT N that hits the “ORDER BY + LIMIT optimizer quirk”

Why it’s naughty

This is a classic planner trap: the optimizer tries to be clever and exploit an index that matches ORDER BY, so it can stop early with LIMIT. But if the filtering predicate is satisfied “elsewhere” (joins, correlations, distribution skew), Postgres may end up scanning far more rows than expected before it finds the first N that qualify. That’s the performance cliff.
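A sketch of the trap and one common mitigation, with illustrative names (the post’s exact schema isn’t shown):

-- Risky pattern: the planner may walk an index on (created_at) backwards,
-- hoping the first matching tenant rows appear quickly. If this tenant's
-- rows are rare or clustered elsewhere, it scans far more than N rows.
SELECT *
FROM events
WHERE tenant_id = $1
ORDER BY created_at DESC
LIMIT 50;

-- Mitigation: a composite index whose leading column matches the filter
-- and whose trailing column matches the ORDER BY, so the scan visits only
-- this tenant's rows, already in the requested order.
CREATE INDEX events_tenant_created_idx ON events (tenant_id, created_at DESC);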

[...]

PostgreSQL Performance: Latency in the Cloud and On Premise
Posted by Hans-Juergen Schoenig in Cybertec on 2025-12-23 at 05:00

PostgreSQL is highly suitable for powering critical applications in all industries. While PostgreSQL offers good performance, there are issues that not many users are aware of but which play a key role when it comes to efficiency and speed in general. Most people understand that more CPUs, better storage, more RAM, and the like will speed things up. But what about something that is equally important?

We are of course talking about “latency”.

What does latency mean and why does it matter?

The time a database needs to execute a query is only a fraction of the time the application needs to actually receive an answer. The following image shows why:

When the client application sends a request, it will ask the driver to send a message over the wire to PostgreSQL (a), which then executes the query (b) and sends the result set back to the application (c). The core question now is: are (a) and (c) relevant relative to (b)? Let us find out. First, we will initialize a simple test database with pgbench. For the sake of this test, a small database is totally sufficient:

cybertec$ pgbench -i blog 
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...                                                                              
creating primary keys...
done in 0.19 s (drop tables 0.00 s, create tables 0.02 s, client-side generate 0.13 s, vacuum 0.02 s, primary keys 0.02 s).

Now, let us run a first simple test. It starts a single connection over a UNIX socket and runs a read-only test for 20 seconds:

cybertec$ pgbench -c 1 -T 20 -S blog
pgbench (17.5)
starting vacuum...end.
transaction type: 
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tri
[...]

Instant database clones with PostgreSQL 18
Posted by Radim Marek on 2025-12-22 at 23:30

Have you ever watched a long-running migration script, wondering if it's about to wreck your data? Or wished you could "just" spin up a fresh copy of the database for each test run? Or wanted reproducible snapshots to reset between runs of your test suite (and yes, because you are reading boringSQL, needed to reset the learning environment)?

When your database is a few megabytes, pg_dump and restore works fine. But what happens when you're dealing with hundreds of megabytes/gigabytes - or more? Suddenly "just make a copy" becomes a burden.

You've probably noticed that PostgreSQL connects to template1 by default. What you might have missed is that there's a whole templating system hiding in plain sight. Every time you run

CREATE DATABASE dbname;

PostgreSQL quietly clones the standard system database template1 behind the scenes, making it the same as if you had run

CREATE DATABASE dbname TEMPLATE template1;

The real power comes from the fact that you can replace template1 with any database. You can find more in the Template Databases documentation.
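A minimal sketch of that idea, assuming a pre-seeded database named app_seed (both names are illustrative):

-- Mark a prepared database as a template, then clone it for each test run.
-- (The source database must have no other active connections while it is
-- being cloned.)
ALTER DATABASE app_seed WITH IS_TEMPLATE true;
CREATE DATABASE test_run_001 TEMPLATE app_seed;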

In this article, we will cover a few tweaks that turn this templating system into an instant, zero-copy database cloning machine.

CREATE DATABASE ... STRATEGY

Before PostgreSQL 15, when you created a new database from a template, it operated strictly on the file level. This was effective, but to make it reliable, Postgres had to flush all pending operations to disk (using CHECKPOINT) before taking a consistent snapshot. This created a massive I/O spike - a "Checkpoint Storm" - that could stall your production traffic.

Version 15 of PostgreSQL introduced a new parameter, CREATE DATABASE ... STRATEGY = [strategy], and at the same time changed the default behaviour for how new databases are created from templates. The new default became WAL_LOG, which copies block by block via the Write-Ahead Log (WAL), making I/O sequential (and much smoother) and supporting concurrency without a latency spike. This removed the need for a CHECKPOINT but made the databa

[...]

Contributions for week 52, 2025
Posted by Cornelia Biacsics in postgres-contrib.org on 2025-12-22 at 19:08

Pavlo Golub gave a talk at the WaW Tech conference in Warsaw on Dec 16, 2025.

Hyderabad PostgreSQL UserGroup Meetup on Dec 19 2025. Organised by Hari Kiran.

Speakers:

  • Shameer Bhupathi
  • Jagadish Kantubugata
  • Jeevan DC
  • Uma Shankar PB
  • Uttam Samudrala

The PostgreSQL User Group meeting in Islamabad wasn’t mentioned earlier. The meetup took place on December 6 and was organized by Umair Shahid, who also gave a talk at the event.

Community Blog Posts

PostgreSQL Contributor Story: Mario Gonzalez
Posted by Floor Drees in EDB on 2025-12-22 at 13:47
Earlier this year we started a program (“Developer U”) to help colleagues who show promise for PostgreSQL Development to become contributors.

What happened?
Posted by Devrim GÜNDÜZ in EDB on 2025-12-22 at 08:22
This is an incident report, a post-mortem writeup, and detailed info on how we started supporting multiple RHEL minor versions.

Last month PostgreSQL RPM repos were broken for Rocky Linux and AlmaLinux 9 and 10 users due to an OpenSSL update that Red Hat pushed to versions 10.1 and 9.7, which broke backward compatibility. Actually I broke the repos. Continue reading "What happened?"

fresh dll of orafce and plpgsql_check for PostgreSQL 17 and PostgreSQL 18
Posted by Pavel Stehule on 2025-12-20 at 06:25

I compiled and uploaded zip files with the latest orafce and plpgsql_check for PostgreSQL 17 and PostgreSQL 18 - I used Microsoft Visual C++ 2022.

Setup:

  1. download orafce-4.16.3-x86_64-windows.zip or plpgsql_check-2.8.5-x86_64-windows.zip and extract files
  2. copy related dll file to PostgreSQL lib directory (NN is number of pg release)
    orafce-NN.dll -> "c:\Program Files\PostgreSQL\NN\lib"
  3. remove suffix "x64-16" or "x64-17" from dll file
    orafce-NN.dll -> orafce.dll
  4. copy *.sql and *.control files to extension directory
    *.sql, *.control -> "c:\Program Files\PostgreSQL\NN\share\extension"
  5. execute the SQL command CREATE EXTENSION with superuser rights
    CREATE EXTENSION orafce;




The OOM-Killer Summoning Ritual: “Just Increase work_mem”
Posted by Mayur B. on 2025-12-19 at 21:57

You’ve probably seen the incident pattern:

  1. Postgres backends start disappearing.
  2. dmesg / journalctl -k shows the kernel OOM killer reaping postgres.
  3. Someone spots “out of memory” and reflexively recommends: “Increase work_mem.”

That recommendation is frequently backwards for OS OOM kills.

The linguistic trap: “Out of memory” sounds like “not enough work_mem”

work_mem is not “memory for the query.” It is a base, per-operation budget for executor nodes like sorts and hash tables before they spill to temporary files. PostgreSQL’s own docs explicitly warn that a complex query can run multiple sort/hash operations concurrently, and many sessions can do this at the same time, so the total memory used can be many times work_mem.

If you raise work_mem globally, you are raising the ceiling on many potential concurrent memory consumers. That can turn “rare spike” into “frequent OOM kill.”
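If a single reporting query genuinely needs a bigger budget, a narrower alternative to a global bump is to raise work_mem only for that statement; a sketch (the value and query are illustrative):

BEGIN;
-- Applies only until COMMIT/ROLLBACK; the rest of the workload keeps
-- the conservative server-wide setting.
SET LOCAL work_mem = '256MB';
SELECT customer_id, sum(total_amount)
FROM orders
GROUP BY customer_id;
COMMIT;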

OS OOM kill vs “Postgres OOM”: two different failure modes

There are two scenarios people accidentally conflate:

  • Executor spills to disk (healthy): Postgres hits a per-node memory budget and writes temp files.
  • Kernel OOM kill (host-level failure): Linux cannot satisfy memory demands (often influenced by overcommit behavior) and terminates a process to keep the system alive.

In other words: if the kernel is killing Postgres, “give Postgres permission to use even more memory next time” is not a stabilizing strategy.
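One way to tell the two situations apart in practice: an EXPLAIN (ANALYZE) run shows whether a sort stayed within work_mem or spilled to temp files (a sketch with an illustrative table; the plan lines shown are abbreviated):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events ORDER BY created_at;

-- In the output, look for lines such as:
--   Sort Method: external merge  Disk: 112640kB   -- spilled: slower, but safe
--   Sort Method: quicksort  Memory: 2048kB        -- fit within work_mem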

The source says what the docs say: work_mem is “allowed memory,” then temp files

The tuplesort implementation is blunt about the design: it keeps tuples in memory up to the limit and then switches to temporary “tapes” (temp files) for an external sort.

A tiny excerpt from src/backend/utils/sort/tuplesort.c (Doxygen):

/* ... memory allowe
[...]

Code Signing fun and games for pgAdmin
Posted by Dave Page in pgEdge on 2025-12-19 at 12:53
For years now, pgAdmin builds on Windows have been code signed using a certificate installed on a build machine in AWS. That certificate is expiring, so I had to look for a new one (I no longer work at a company that supports Windows and could provide one), and much to my annoyance I found that the requirements have changed and a secure physical device is now required for key storage. Keys are now generally much more expensive, and whilst there are options to have them installed in virtual keystores at AWS or similar, that adds thousands of dollars to the cost(!) for a supervised installation. Thankfully, Certum offer an Open Source Developer option, which is much more reasonably priced, albeit only valid for a year and only offering basic levels of assurance.

The card and reader arrived this week, and once I was approved, a certificate issued. The first problem I hit was that you cannot run the card manager application via Remote Desktop - you have to do it on the machine's console, or via VNC or similar. I'm running on Proxmox (having passed through the card reader device to the VM), so that meant using the console in the Proxmox UI.

Once the card manager was running, I was able to install the key through the Certum website. So far, so good - until I tried to sign a test binary. That's when the fun began. After a morning of back-and-forth, working with Claude Desktop, we finally got it working with a test binary. In the hope that it will help others in the future (including myself and the rest of the pgAdmin team when the certificate expires in a year), the Claude-generated notes on what we did are below.

My next steps are to figure out how to automate PIN entry, and then incorporate all the changes into the pgAdmin CI/CD pipelines so I don't have to enter a PIN multiple times every day when the team commits changes. Fun times!

Certum Open Source Code Signing Certificate - Windows Setup Reference

Date: December 2025
Hardware: Certum cryptographic card with ACS ACR40T ICC Reader


[...]

pgEdge Support for Large Object Logical Replication
Posted by Ahsan Hadi in pgEdge on 2025-12-19 at 12:12


RAG Servers vs MCP Servers: Choosing the Right Approach for AI-Powered Database Access
Posted by Dave Page in pgEdge on 2025-12-19 at 11:31

As AI capabilities continue to evolve and integrate more deeply into our applications, we’re faced with interesting architectural decisions about how to expose our data to large language models (LLMs). Two approaches that have gained significant traction are Retrieval Augmented Generation (RAG) servers (such as pgEdge RAG Server) and Model Context Protocol (MCP) servers (such as pgEdge Natural Language Agent). Both have their place, but they serve quite different purposes and come with vastly different security implications – particularly when it comes to database access.

What is a RAG Server?

RAG servers are designed to enhance LLM responses by providing relevant context from a knowledge base. The basic flow is straightforward: when a user asks a question, the RAG server searches for relevant documents or data chunks, retrieves them, and passes them to the LLM along with the original question. The model then generates a response based on both its training and the provided context. The key characteristic of a RAG server is that it acts as a carefully controlled intermediary. The server’s API defines exactly what operations are possible, what data can be retrieved, and how that data is formatted before being passed to the model. The LLM never directly touches your database; it only sees what the RAG server chooses to show it.

What is an MCP Server?

MCP (Model Context Protocol) servers take a fundamentally different approach. Rather than providing pre-defined retrieval operations, MCP exposes a set of tools that the LLM can invoke directly. In the context of database access, this might include tools to execute SQL queries, browse schemas, or interact with stored procedures. The power of MCP lies in its flexibility. Instead of being limited to whatever retrieval logic was baked into a RAG server, an LLM connected to an MCP server can dynamically construct queries based on what it needs. This makes it exceptionally useful for exploratory data analysis, ad-hoc reporting, and other scenarios where the questions[...]

Dev Container for pgrx PostgreSQL Extensions: Lessons Learned
Posted by Pavlo Golub in Cybertec on 2025-12-19 at 06:00

I like reproducible development. I also like short feedback loops. Combining both for pgrx was… educational. 🙂 In this post, I share the mistakes, the small pains, and the fixes I used to get a working VS Code dev container for a Rust project that builds PostgreSQL extensions with pgrx. If you’re writing extensions or using pgrx in a team, this will save you a few grey hairs.

TL;DR:

  • PostgreSQL refuses to run as root. Don’t run your devcontainer as root if cargo pgrx test must work.
  • Install user-specific Rust tools as the non-root user; remember that order matters.
  • Run cargo pgrx init after the container starts (use postCreateCommand) so the config persists.

A quick story

I was setting up a devcontainer for an etcd_fdw project that uses pgrx to produce a PostgreSQL extension. At first, it seemed straightforward: start from the official Rust image, install build deps, add cargo-pgrx, and be done.

But real life is noisy. I hit permission errors, failing tests, and an annoying “config.toml not found. Have you run 'cargo pgrx init' yet?” message. After some digging, I discovered three core issues that you should watch out for.

The PostgreSQL root problem

PostgreSQL will not run as root by design. cargo pgrx test launches PostgreSQL instances for integration testing. If your devcontainer runs as root (or if you initialize pgrx as root and then switch user incorrectly), tests will fail.

First, I tried setting remoteUser: "root" in devcontainer.json and installing everything as root. The build failed with errors when tests attempted to run PostgreSQL.

The fix is (simplified Dockerfile snippet):

# Create a non-root user early in the image build
RUN useradd -m -s /bin/bash -u 1000 vscode && \
echo "vscode ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

# Switch to that user for subsequent steps that create user-owned files
USER vscode

The key takeaway is to think about runtime actions (what cargo pgrx test will do) when you design the container user. Create and use a non-root user earl

[...]

🐏 Taming PostgreSQL GUC “extra” Data
Posted by David Wheeler on 2025-12-18 at 18:04

New post up on the ClickHouse blog:

I wanted to optimize away parsing the key/value pairs from the pg_clickhouse.session_settings GUC on every query by pre-parsing it on assignment and assigning the result to a separate variable. It took a few tries to land on a workable and correct solution, as the GUC API requires quite specific memory allocation for extra data to work properly.

Struggling to understand, making missteps, and ultimately coming to a reasonable design and solution satisfies me so immensely that I always want to share. This piece gets down in the C coding weeds; my fellow extension coders might enjoy it.

Enhancing PostgreSQL OIDC with pg_oidc_validator
Posted by Jan Wieremjewicz in Percona on 2025-12-17 at 11:00

With PostgreSQL 18 introducing built-in OAuth 2.0 and OpenID Connect (OIDC) authentication, tools like pg_oidc_validator have become an essential part of the ecosystem by enabling server-side verification of OIDC tokens directly inside PostgreSQL. If you’re new to the topic, make sure to read our earlier posts explaining the underlying concepts and the need for external validators:

This release builds on the initial version announced in October and continues our mission to make OIDC adoption in PostgreSQL reliable, fast, and accessible for all users.

The Road to Deploy a Production-Grade, Highly Available System with Open-Source Tools
Posted by semab tariq in Stormatics on 2025-12-17 at 10:48

Everyone wants high availability, and that’s completely understandable. When an app goes down, users get frustrated, business stops, and pressure builds.

But here’s the challenge: high availability often feels like a big monster. Many people think, “If I need to set up high availability, I must master every tool involved.” And there’s another common belief too: “Open-source tools are not enough for real HA, so I must buy paid tools.”

These assumptions make high availability seem far more complex than it really is, and in this series we are going to address them.

This Is a 2-Part Series.

  • Part 1 (this one): We will lay the foundation by answering the most important questions you should consider before going hands-on with HA systems.
  • Part 2: We will go fully hands-on. I will walk through the architecture diagram, the tool stack, and provide the exact commands and step-by-step instructions to deploy the cluster based on your requirements.

The “Number of Nines”, RTO, and RPO

These are the foundations of a high-availability cluster. If you understand them and answer them clearly, you are already very close to building your HA setup.

Imagine you have a main site (your primary system). Things are working fine. Life is good. But one day, a disaster happens, maybe a server fails, a region goes down, or your database crashes.

At that moment, three questions decide everything.

1) How much downtime can you accept?

This is where the number of nines comes in (like 99.9% uptime, 99.99%, and so on). More nines usually mean less downtime, but also more effort and architectural cost.

Here’s a simple table to help you choose the right architecture based on your needs.

Target uptime (“nines”) Allowed downtime (per year) A setup that usually fits Notes / what y
[...]

pgBackRest preview - simplifying manual expiration of oldest backups
Posted by Stefan Fercot in Data Egret on 2025-12-17 at 10:35

A useful new feature was introduced on 11 December 2025: Allow expiration of the oldest full backup regardless of current retention. Details are available in commit bf2b276.

Before this change, it was awkward to expire only the oldest full backup while leaving the existing retention settings untouched. Users had to temporarily adjust retention (or write a script) to achieve the same result. Expiring the oldest full backup is particularly helpful when your backup repository is running low on disk space.

Let’s see how this works in practice with a simple example.


Example of expiring the oldest full backup

Let’s use a very basic demo setup.

$ pgbackrest info
...
wal archive min/max (18): 000000010000000000000056/000000010000000300000038

full backup: 20251217-095505F
    wal start/stop: 000000010000000000000056 / 000000010000000000000056

incr backup: 20251217-095505F_20251217-095807I
    wal start/stop: 00000001000000010000001B / 00000001000000010000001B
    backup reference total: 1 full

full backup: 20251217-095902F
    wal start/stop: 000000010000000100000090 / 000000010000000100000090

incr backup: 20251217-095902F_20251217-100124I
    wal start/stop: 000000010000000200000013 / 000000010000000200000013
    backup reference total: 1 full

This gives us two full backups, each of them with a linked incremental backup. Note that I have deliberately reduced the output of the info command to the information relevant to our test case.

$ pgbackrest expire --stanza=demo --dry-run
P00   INFO: [DRY-RUN] expire command begin 2.57.0: ...
P00   INFO: [DRY-RUN] repo1: 18-1 no archive to remove
P00   INFO: [DRY-RUN] expire command end: completed successfully

If we run the expire command, nothing is expired, as I have set repo1-retention-full=2, meaning that two full backups should be kept.

Let’s now imagine that your backup repository is running out of disk space and you want to remove the oldest backup to free up some space. How would you do that?

Once you have identified the back

[...]

Introducing The pgEdge Postgres MCP Server
Posted by Phillip Merrick in pgEdge on 2025-12-17 at 07:02

One of the principal and most powerful components of the pgEdge Agentic AI Toolkit is the pgEdge Postgres MCP Server. In just over a year, MCP (Model Context Protocol), initially developed by Anthropic, has become the standard way to connect LLMs to external data sources and tools. Some people describe it as being like a USB port for LLMs. The pgEdge Postgres MCP Server makes it extremely easy to connect Claude Code, Claude Desktop, Cursor and other AI development tools directly to a Postgres database – not just pgEdge distributions, but standard community Postgres, Amazon RDS, and pretty much any other relatively standard version of Postgres (so long as it is v14 or newer). Let’s walk through how to set up the pgEdge Postgres MCP Server, what makes it different, and how to use it with Claude Desktop and AI code generators like Claude Code and Cursor. Our discussion is somewhat Claude-centric, but the pgEdge MCP Server also works with OpenAI GPT-5 and local models such as Ollama.

What Makes Our MCP Server for Postgres Different

Perhaps your first reaction to seeing our announcement was “wait, what? Another Postgres MCP Server?”. But it turns out that until today there was no dedicated Postgres vendor offering a fully featured and fully supported MCP Server that works with all your existing Postgres databases. Most of the available Postgres MCP Servers are tied to the vendor's own products, and in particular their cloud database offering. The pgEdge MCP Server provides flexible deployment options: on-premises, in self-managed cloud accounts, or soon in our pgEdge Cloud managed cloud service. Additionally, when used with pgEdge Distributed Postgres, applications making use of the pgEdge MCP Server can meet enterprise requirements for high availability, multi-region failover and data sovereignty. Most Postgres MCP servers out there give you basic read-only access—enough to query schemas and run SELECT statements. We built ours to be production-grade from day one: the MCP server doesn't just list your tables. It pulls det[...]

Building AI Agents on Postgres: Why We Built the pgEdge Agentic AI Toolkit
Posted by Phillip Merrick in pgEdge on 2025-12-16 at 17:57

We are delighted today to be announcing the beta release of our pgEdge Agentic AI Toolkit for Postgres. We’ve had the benefit of collaborating on real-world Postgres-based AI applications the past two years with a number of leading customers, and this product announcement is the outgrowth of this learning. We listened to customers as they refined their AI strategies in response to the rapid evolution of LLMs, Agentic AI and integration technologies such as the Model Context Protocol (MCP), and as we did so a few things stood out to us.

First and foremost, many of the newly available tools and technologies are not suited to the needs of the enterprise, particularly in highly regulated industries or major government agencies. Many of the new AI application builders and code generators – and the database platforms supporting them – do not adequately address enterprise requirements for high availability, data sovereignty, global deployment, security and compliance and the need in some cases to run on-premises or in self-managed cloud accounts. As one CIO in financial services put it to us recently: “We’ve got a couple of dozen AI generated applications end users really want to put into production, but first we’ve got to figure out how to deploy them on our own internal compliant infrastructure.”

Secondly, as compelling as it is to automate workflows with Agentic AI, or to generate new applications with tools like Claude Code, Replit, Cursor or Lovable, the biggest need is to work with existing databases and applications. While newer Postgres-based cloud services work well with Agentic AI and AI app builders for brand new applications, they cannot accommodate existing databases and applications without a costly migration. And such a migration may well be to an environment that doesn’t meet the organization’s strict security and compliance requirements. Enterprise customers need AI tooling – including an MCP Server – that can operate against their existing databases.

Additionally we saw there was no dedicated Postg

[...]

Which indexes can be corrupted after an operating system upgrade?
Posted by Laurenz Albe in Cybertec on 2025-12-16 at 06:00

Dialog between two men: "We had some corruption after the operating system upgrade." — "What, Windows 11 and a different codepage in Notepad?" — "No, PostgreSQL and a natural language collation."
© Laurenz Albe 2025

Most major Linux distributions have upgraded to the GNU C library version 2.28 or later. Therefore, there is a growing awareness that an operating system upgrade can lead to index corruption. However, rebuilding all your indexes can slow down the upgrade process considerably. In this article, I want to discuss how you can avoid rebuilding more indexes than necessary.

A management summary of the problem

For natural language collation support (the rules to compare and sort strings in natural languages), PostgreSQL largely relies on external libraries. Depending on how you ran initdb, that could be the C library or the ICU library. Sometimes bugfixes in the new library versions that you install during an operating system upgrade lead to a change in a collation. Since database indexes are pre-sorted data structures, a change in the rules for sorting strings can leave such indexes sorted incorrectly. This index corruption can lead to wrong query results and violated unique constraints.

The solution is to rebuild an affected index with the REINDEX command.
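On PostgreSQL 15 or later you can check whether the recorded collation version still matches what the upgraded library reports, and then rebuild only what is affected; a minimal sketch (index and database names are illustrative):

-- Compare the collation version recorded for the current database with
-- the version provided by the library that is installed right now.
SELECT datname,
       datcollversion                            AS recorded_version,
       pg_database_collation_actual_version(oid) AS library_version
FROM pg_database
WHERE datname = current_database();

-- If they differ, rebuild the affected indexes, then record the new version.
REINDEX INDEX CONCURRENTLY my_text_idx;
ALTER DATABASE mydb REFRESH COLLATION VERSION;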

Keeping the down time for an operating system upgrade short

For many users, it is important that any down time of the database is as short as possible. For such users, the need to rebuild indexes is a problem. If you want to play it safe, you cannot start working with the database before rebuilding the indexes is completed. You could also take a more relaxed attitude and risk starting the application right away. Then you would use REINDEX INDEX CONCURRENTLY to rebuild problematic indexes while your application is running. The risk is that your index scans may return wrong results for a while, and few people are willing to take such a risk.

One way to perform an operating system upgrade that avoids the problem of corrupted indexes and can keep the down time very short is logical replication. Here, the subscriber builds its indexes based on logical information (similar to SQL statements). Consequently, th

[...]

From PGDays to PGCon/fs
Posted by Valeria Kaplan in Data Egret on 2025-12-15 at 22:51

A Journey Through PostgreSQL Events

• Introduction
• Community recognition (transparency, inclusivity, organisational balance)
• If I organise a PostgreSQL event — does it have to be recognised?
• Are community-recognised events better than those that aren’t?
• Conferences — a quick flyover (pgDays, pgConfs, FOSDEM,
PGConf.dev etc.)
• PostgreSQL User Groups / meet-ups
• Key takeaways
• Which PostgreSQL event to attend?

Introduction

Working at a PostgreSQL-focused company for over 10 years takes you places. Specifically, it takes you to PostgreSQL community events that are very different from your standard commercial conference.

The dynamic of community events is different. They feel like a village fête, where people have known each other for ages. There’s familiarity, shared purpose, and a genuine sense of belonging.

At the same time, there are many different Postgres-focused events — community and commercial ones, small meet-ups, bigger pgDays, and even larger pgConfs - each with a slightly different audience and scope in mind.

So when I was recently asked, “Which Postgres conference would you recommend going to?”, I realised this might be a good topic for a post.

A quick note: here I’ll mention a handful of events, predominantly in Europe, but there are, of course, many other Postgres-focused events out there. From PG Down Under in Australia to pgDay Mexico to PGConf NYC 2025 to growing user group meetings in Madagascar and Armenia, and the virtual ones, such as POSETTE and Postgres meetup for all, Postgres definitely has momentum.

When you see 🔔 below, those are actions or deadlines worth noting.

Community recognition

The majority of PostgreSQL conferences can be found on the PostgreSQL.org events page, and of those, some are community recognised (note the small gold medal next to their titles) while others aren’t.

Screenshot of conferences listed on postgresql.org; some conferences have small gold medals next to them. Source: postgresql.org

Community event recognition indicates that an event complies with the PostgreSQL event recog

[...]

Improved Markdown Parsing
Posted by David E. Wheeler on 2025-12-15 at 15:55

Quick announcement to say that I’ve replaced the ancient markdown parser with a new one, discount, which supports tables, code fences, definition lists, and more. I reindexed pg_clickhouse this morning and it’s sooo nice to see the table properly formatted.

New uploads will use this parser for Markdown (but not MultiMarkdown) files from now on. I think I’ll start working on reindexing all existing extensions, too. Give me a holler if you don’t see an improvement in your extensions in the next few days.

PostgreSQL 18 Asynchronous Disk I/O – Deep Dive Into Implementation
Posted by Josef Machytka in credativ on 2025-12-15 at 09:53

PostgreSQL 17 introduced streaming I/O – grouping multiple page reads into a single system call and using smarter posix_fadvise() hints. That alone gave up to ~30% faster sequential scans in some workloads, but it was still strictly synchronous: each backend process would issue a read and then sit there waiting for the kernel to return data before proceeding. Before PG17, PostgreSQL typically read one 8kB page at a time.

PostgreSQL 18 takes the next logical step: a full asynchronous I/O (AIO) subsystem that can keep multiple reads in flight while backends keep doing useful work. Reads become overlapped instead of only serialized. The AIO subsystem is deliberately targeted at operations that know their future block numbers ahead of time and can issue multiple reads in advance:
  • Heap sequential scans, like plain SELECT and COPY operations that stream lots of data
  • VACUUM on big tables and indexes
  • ANALYZE sampling
  • Bitmap heap scans

Autovacuum benefits from this change too, since its workers share the same VACUUM/ANALYZE code paths. Other operations still remain synchronous for now:

  • B‑tree index scans / index‑only scans
  • Recovery & replication
  • All write operations INSERT, UPDATE, DELETE, WAL writes
  • Small OLTP lookups that touch a single heap page

Future work is expected to widen coverage, especially index‑only scans and some write‑path optimizations.
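The relevant knobs can be inspected and adjusted like any other server setting; a sketch assuming PostgreSQL 18 (availability of io_uring depends on the platform and build):

-- Inspect the asynchronous I/O settings.
SHOW io_method;                 -- 'worker' by default; 'io_uring' and 'sync' are alternatives
SHOW io_workers;                -- number of background I/O worker processes
SHOW effective_io_concurrency;  -- how many reads may be kept in flight

-- Server-level changes; switching io_method requires a restart.
ALTER SYSTEM SET io_method = 'io_uring';
ALTER SYSTEM SET io_workers = 8;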

Significant improvements for cloud volumes

Community benchmarks show that PostgreSQL 18 AIO significantly improves cold cache data reads in cloud setups with network‑attached storage where latency is high. AWS documentation states that average latency of Block Express volumes is “under 500 microseconds for 16 KiB I/O size”, while the latency of General Purpose volumes can exceed 800 microseconds. Some articles suggest that under high load each physical block read from disk can cost around 1ms, while page processing in PostgreSQL is much cheaper. By combining many pages into one read, all

[...]

PostgreSQL Contributor Story: Nishant Sharma
Posted by Floor Drees in EDB on 2025-12-15 at 07:56
Earlier this year we started a program (“Developer U”) to help colleagues who show promise for PostgreSQL Development to become contributors. Meet: Nishant Sharma, Staff SDE, who maintains a list of places in the world he has yet to visit.

Anonymising PII in PostgreSQL with pgEdge Anonymizer
Posted by Dave Page in pgEdge on 2025-12-15 at 06:13

Data privacy regulations such as GDPR, CCPA, and HIPAA have made it increasingly important for organisations to protect personally identifiable information (PII) in their databases. Whether you're creating a development environment from production data, sharing datasets with third parties, or simply trying to minimise risk, you'll often need to anonymise sensitive data whilst maintaining the structure and relationships within your database. I've been working on a tool to address this need: pgEdge Anonymizer. It's a command-line utility that replaces PII in PostgreSQL databases with realistic but fake values, all whilst preserving referential integrity and data consistency.

The Problem

Consider a typical scenario: you have a production database containing customer records, and you need to create a copy for your development team. The data includes names, email addresses, phone numbers, National Insurance numbers, and credit card details. You can't simply hand over the production data as that would be a compliance nightmare, but you also need the development database to contain realistic data that exercises the same code paths as production.

Manually anonymising this data is tedious and error-prone. You need to ensure that:
  • The same customer email appears consistently across all tables
  • Foreign key relationships remain intact
  • The anonymised data looks realistic (not just "XXXX" or "test@test.com")
  • The process is repeatable and auditable

Enter pgEdge Anonymizer

pgEdge Anonymizer addresses these challenges with a simple YAML-based configuration approach. You define which columns contain PII and what type of data they hold, and the tool handles the rest.

Installation

Building from source is straightforward, and it produces a single binary that you can copy wherever you need it.

Configuration

The configuration file defines your database connection and the columns to anonymise. Here's a typical example: each column is specified using its fully-qualified name and assigned a pattern [...]

Contributions for week 51, 2025
Posted by Cornelia Biacsics in postgres-contrib.org on 2025-12-14 at 20:57

On December 11, the PGDay CERN 2026 CfP committee met to finalize the talk selection. The committee members are listed here:

  • Tobias Bussmann, SCNAT (chair, non voting)
  • Abel Cabezas Alonso, CERN
  • Maurizio De Giorgi, CERN
  • Andreas Geppert, ZKB
  • Julia Gugel, migrolino
  • Tom Hagel, Volue
  • Svitlana Lytvynenko, CYBERTEC
  • Gülçin Yıldırım Jelinek, Xata

IT Tage Frankfurt Dec 8-11 2025 Workshops:

  • Hans-Jürgen Schönig

Talks:

  • Thomas Koch
  • Raphael Salguero Aragón
  • Robert Baric
  • Hans-Jürgen Schönig
  • Adrien Obernesser
  • Bernd Patolla
  • Michael Mühlbeyer
  • Daniel Westermann
  • Dirk Krautschick
  • Chris Norman Bischoff
  • Jonas Gassenmeyer
  • Markus Dahm
  • Dr. Veikko Krypczyk

Meetup Frankfurt on December 10 2025 organised by Ilya Kosmodemiansky and Diego Calvo de No

Speakers:

  • Andreas Scherbaum
  • Dirk Aumueller

New podcast episode “What Postgres developers can expect from PGConf.dev” published by Claire Giordano from her series “Talking Postgres”, with Melanie Plageman.

On December 11 2025, Jimmy Angelakos organized the first PostgreSQL Edinburgh meetup. Chris Ellis and Jimmy Angelakos spoke at this meetup.

The Swiss PostgreSQL Users had a meetup in Zurich; the speakers were Stefan Keller and Lars Herrmann.

The Talk Selection Committee from the Prague PostgreSQL Developer Day finalized the schedule for P2D2 - the team consists of:

  • Pavlo Golub
  • Pavel Hák
  • Hana Litavská
  • Teresa Lopes
  • Mayur B
  • Esther Miñano
  • Josef Šimánek
  • Pavel Stěhule
  • Tomáš Vondra

How does the PostgreSQL Buildfarm check upgrades across versions?
Posted by Andrew Dunstan in EDB on 2025-12-12 at 19:47

From time to time I see questions from otherwise well informed people about how the PostgreSQL Buildfarm checks pg_upgrade across versions, e.g. how it checks upgrading from release 9.5 to release 18. I realize that this isn't well documented anywhere, so here is a description of the process.

All of the code referenced here can be found at https://github.com/PGBuildFarm/client-code.

The principal buildfarm client script is run_build.pl, which builds and tests a single branch (e.g. REL_17_STABLE). There is another script called run_branches.pl, which calls this run_build

Zero-Downtime PostgreSQL Maintenance with pgEdge
Posted by Antony Pegg in pgEdge on 2025-12-12 at 06:02

PostgreSQL maintenance doesn't have to mean downtime anymore. With pgEdge's zero-downtime node addition, you can perform critical maintenance tasks like version upgrades, hardware replacements, and cluster expansions without interrupting production workloads. Your applications stay online. Your users stay connected. Your business keeps running.

This capability is available across both single-primary deployments (with integration of the open source extension Spock) and globally distributed deployments (by default), giving you the same operational advantages whether you're running a single-region deployment or a globally distributed system. Now, for an incredibly quick and easy approach to zero-downtime node addition, you can use the pgEdge Postgres Control Plane (hosted on GitHub). This approach provides drastically simplified management and orchestration of Postgres databases using a declarative API, whether you're running a single-primary or a globally distributed deployment of PostgreSQL clusters. Spock and other high-availability components come built in alongside community PostgreSQL for quick database administration with simple commands.

And because pgEdge and all associated components are 100% open source under the PostgreSQL license, using 100% core community PostgreSQL, you get to leverage the high-availability components that enable zero-downtime maintenance without vendor lock-in or compatibility concerns.

What Is the Spock Extension?

Spock is pgEdge's advanced logical replication extension for PostgreSQL that enables active-active (multi-master) replication in clusters with row filtering, column projection, conflict handling, and more. Even though Spock originated from earlier projects like pgLogical and BDR 1, it has seen enormous growth. Our dedicated team of PostgreSQL experts continues to push Spock forward, making it a high-performance, enterprise-grade replication system built for distributed environments.

Zero-Downtime Node Addition: Maintenance Without Interruption

Adding a new node [...]
