
Please join me on September 9th at 10 am Pacific Time for my webinar entitled "Columnar Store for PostgreSQL using cstore_fdw". This webinar will give you a comprehensive overview of cstore_fdw, an open source PostgreSQL extension from Citus Data that lets you store tables in columnar form. During this live webinar, we will discuss how columnar storage differs from row-based storage: data is stored in compressed form, column by column, in contiguous blocks on disk, as opposed to storing each row separately.

You will learn which use-cases are the best fit for cstore_fdw along with performance tuning best practices. We will take a deeper dive into the Optimized Row Columnar (ORC) storage format that cstore_fdw uses and discuss key features such as skip-list indexes. We will also look at practical examples of how to make effective use of cstore_fdw, such as using it to reduce the storage size of tables with JSONB columns over 6x. Finally, I will give a live demo of how to set up cstore_fdw tables yourself and demonstrate some of the latest and upcoming features. PostgreSQL 9.5 adds foreign table inheritance, which lets you create tables that are part row-based and part columnar and move data between the different partitions, giving you the best of both worlds.
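As a taste of what the setup looks like, creating a columnar table with cstore_fdw follows the usual foreign data wrapper pattern (a minimal sketch; the table and column names here are illustrative, not taken from the webinar):

```sql
-- Load the extension and create a cstore_fdw server (once per database).
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- An illustrative columnar table; the 'pglz' option enables
-- built-in compression of the column data on disk.
CREATE FOREIGN TABLE events_columnar (
    event_time  timestamptz,
    payload     jsonb
)
SERVER cstore_server
OPTIONS (compression 'pglz');
```

Once created, the table is queried like any other table, but reads only touch the columns the query actually references.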

You can learn more by visiting the Citus Data Webinars page or use the button below to register now.

Webinar on Columnar Store for PostgreSQL using cstore_fdw on September 9th

Learn how columnar storage can significantly reduce storage size, memory footprint, and I/O utilization for online analytical processing (OLAP) workloads

Posted by Hans-Juergen Schoenig in Cybertec on 2015-09-01 at 09:34:33
NVIDIA’s CUDA is definitely a great thing and I have to admit that I got excited years ago when I first learned about it. For many operations a nicely optimized GPU implementation definitely seems the way to go. GPUs are traditionally used for scientific operations and massively parallel tasks. However, some important work is also […]
Posted by Josh Berkus in pgExperts on 2015-09-01 at 05:07:00
One of PostgreSQL's advantages over many other SQL databases is letting users modify database objects with a minimum of locking and fuss, and do it in a transaction so it can be rolled back.  However, a minimum of locking isn't no locking, so one still has to plan for getting a lock to push through the database migration.

For example, say I wanted to add two new columns to a table.  I already know that by making the columns nullable, and not setting a default, I can add them with a metadata-only change which requires only an update to the system catalogs.  However, this table change does require an ACCESS EXCLUSIVE lock for a few milliseconds to go through, which can be tricky.  See, an access exclusive lock blocks everything, including reads and autovacuum, which can be a bit messy if your timing is bad.  You can end up waiting for that lock behind a long-running query, and then all the other database traffic can pile up behind you.

Now, if we had ALTER TABLE CONCURRENTLY it would take care of this for you.  But we don't (yet), so you need to fake it with a DO script.  Here's an example DO script which polls for a lock every 2 seconds until it can get one, then pushes through the table change:

    DO $f$
    DECLARE ntries INT := 10;
        sleepytime INT := 2;
    BEGIN
        FOR get_lock IN 1 .. ntries LOOP
            BEGIN
                LOCK TABLE mytable IN ACCESS EXCLUSIVE MODE NOWAIT;
                ALTER TABLE mytable ADD COLUMN new_col1 INT,
                        ADD COLUMN new_col2 VARCHAR;
                RAISE INFO 'table updated';
                RETURN;
            EXCEPTION
                WHEN lock_not_available THEN
                    PERFORM pg_sleep(sleepytime);
            END;
        END LOOP;
        RAISE INFO 'unable to obtain lock after % tries', ntries;
    END;
    $f$;


The idea here is that you keep trying to LOCK ... NOWAIT, which will throw an error if it can't get the lock immediately.  Then it sleeps and tries again 2 seconds later

[continue reading]

Posted by David E. Wheeler on 2015-08-31 at 21:48:54

I took a little time this summer to finally address some nagging issues in PGXN Manager, the site to which extensions are uploaded and added to the master repository. Most of the changes were internal, improving the interface through which I sometimes have to re-index a release. There are also a couple of minor changes to the sample Makefile in the How To. But the most important change for new releases going forward is that version ordering is now enforced. That means two things:

  • Distribution versions must be greater than the version of the previous release. You can no longer release v1.2.0 today and v1.1.0 tomorrow.
  • Extension versions must be greater than or equal to versions in previous releases. It’s pretty common to have a new release version but have embedded extensions be the same version as before. But they can’t be any less than before.

The changes have been applied to the reindexing code, as well, which also prevents versions from being greater than in subsequent releases. That won’t come up very often—most of the time, I end up reindexing something that has just been released. But in the future I expect to add an interface for release managers to reindex their own distributions, so it may be that someone reindexes a release that’s a few versions old. Or not. But just in case, it’ll be handled.

So what are you going to release on PGXN today? Get to it!

Helsinki PostgreSQL User Group will convene tomorrow, on Tuesday September 1st 2015 at 15:00.

I will be babbling about the various index types in PostgreSQL: GIN, GiST, and the humble B-tree. What does the structure of each one look like? What are the strengths and weaknesses of each? And of course, a lot of free chatter on whatever other topics you want to bring up.

We have also created a Meetup group. Registration at Meetup is recommended to ensure you get a seat:

This is the fourth meeting of the Helsinki PostgreSQL User Group. Come and meet fellow PostgreSQL users and developers. Welcome!

September 1st 2015

Rikhardinkatu Library
Rikhardinkatu 3

Posted by Oleg Bartunov in Postgres Professional on 2015-08-30 at 20:14:37
Our company Postgres Professional has conducted performance testing on a big POWER8 (P8) server provided by IBM. One exciting result is that we were able to identify a scalability issue and write a patch (a kludge for now), which brought Postgres performance close to a million TPS. We'll discuss the approach used in the patch on -hackers.


More details are in Alexander's Korotkov blog post.

In the next post we'll compare the performance of Postgres and MariaDB :)
Posted by gabrielle roth on 2015-08-30 at 19:00:00
PostgresOpen is coming up in just a few weeks! (Disclosure: I am on the conference committee.) We are still working out the room assignments; your conference badge/printed program will have the right info. The committee is very excited to have snagged Lacey Williams Henschel to give our keynote, on Open Source & Higher Ed. I’m looking […]
A few days ago, I released a brand-new PostgreSQL extension called "sql_firewall". sql_firewall is intended to protect a PostgreSQL database from SQL injection attacks by limiting which SQL queries can be executed on the database. In this entry, I would like to introduce how it works and how to use it. How
Aquameta Layer 0: meta - Writable System Catalog for PostgreSQL

In the introduction, we talked about Aquameta's first principle, datafication. We wield the broadsword of datafication as we charge the many-headed hydra that is unnecessary programming complexity. In this writeup, we describe our first, and in some ways most challenging, foe: datafication of the database itself.

Layer zero of Aquameta is called meta, a writable system catalog for PostgreSQL. It exposes PostgreSQL administration commands through data manipulation commands, enabling schema creation, table creation, column renaming, role dropping, and much more via INSERTs, UPDATEs, and DELETEs. In other words, it makes the DDL accessible via the DML.

Meta unifies "normal" data with schema, database configuration, everything that is PostgreSQL, into a coherent and synthesized information model. Everything is accessible as data. This adds the needed paths to make PostgreSQL truly homoiconic, which breaks down the wall between schema and data, and opens the doors for all manner of meta-programming.

PostgreSQL already has two system catalogs, INFORMATION_SCHEMA and pg_catalog. Meta is different because:

  • It's writable, and changes to the data take effect in the database
  • It has friendly names for relations and columns
  • It's normalized: the views are sensibly laid out in parallel with PostgreSQL's architecture
  • It has a type system of meta-identifiers for primary keys

Here's a simple example of how to use it. Instead of doing:

aquameta=# create schema test_schema;  

You can do:

aquameta=# insert into meta.schema (name) values ('test_schema');  
INSERT 0 1  

These two commands perform identical operations under the hood; meta just makes them accessible through a different interface.
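Presumably the inverse works the same way; a hypothetical sketch following the same naming convention (not verified against Aquameta itself):

```sql
-- Hypothetical: the DML equivalent of DROP SCHEMA test_schema,
-- assuming meta.schema supports DELETE the way it supports INSERT.
DELETE FROM meta.schema WHERE name = 'test_schema';
```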

Here is an ER diagram of meta's schema:


aquameta=# set search_path=meta;  
aquameta=# \d  
 Schema |         Name         | Type | Owner
--------+----------------------+------+-------
 meta   | cast                 | view | eric
 meta   | column               | view | eric
 meta   | connection           | view | eric

[continue reading]

Posted by Eric Hanson on 2015-08-28 at 23:12:00

I'm happy to announce Aquameta, a web application development platform built entirely in PostgreSQL.

Why We Must Rethink Our Stack

1. Software Development Complexity Bottlenecks Human Progress

The complexity and inaccessibility of software development is inhibiting businesses, education and social progress.

In businesses, especially large ones, software has become the digital nervous system that allows them to operate at scale. But because only programmers can change the software, experimentation and evolution of business practices have actually become slower and more expensive. Most employees are powerless to make even simple software changes, and yet developers often don't have the domain expertise required to write good software for someone else's job.

Education is feeling it too. The demand for programmers is quickly outpacing supply, and for students, learning computer science is becoming more and more of an essential skill. Educators are well aware of this problem, but the complexity of programming makes it very challenging to teach.

Then there is social progress generally. End users have so many great ideas for software systems that could help us all collaborate with each other in deeper, richer ways. But to create them they either need the money or the skill to bring them into reality. Software complexity is a huge bottleneck for social innovation. Not every idea has a business model behind it, and there must be so many good ideas that could exist if programming were easier.

We believe that it is a social imperative, arguably one of the greatest demands of our time, to make software development more accessible to everyone.

2. Today's Web Is Better At Making Money Than Fun

Two decades of profit-driven evolution have led to the web we have today. But this outcome wasn't inevitable, and we can still change it. We have learned a lot since the inception of the web, and when we look at it knowing what we know now, we see some ways things could be a lot cooler. Here are a few.

Conflation of Data and Presentation


[continue reading]

Posted by Josh Berkus in pgExperts on 2015-08-28 at 22:16:00
I write a lot more PL/pgSQL than I'd like to.  Not that I don't like SQL, but as a language PL/pgSQL really shows its thrown-together origin; it's like 80's primitive.  One thing that PL/pgSQL lacks is good string manipulation tools, which is particularly tragic given that the #1 thing to do in PL/pgSQL is to generate queries from parameters and run EXECUTE.

Postgres has two built-in ways to do string substitution: concatenation and format().  Both have drawbacks.  Let me give you an example:

EXECUTE 'SELECT ' || col1 || ', ' || col2 || ' FROM ' || userschema ||
  '.accounts WHERE ' || filterclause || ' ORDER BY ' || col1 || ',' || col2;

EXECUTE format('SELECT %s, %s FROM %s.accounts WHERE %s ORDER BY %s, %s', col1, col2, userschema, filterclause, col1, col2);

You can see the problem here.  Both formats are hard to read and hard to maintain.  Python and Perl have a good fix for this: dictionary/hash-based string substitution, where you can swap in the dictionary keys for the values.  So I wrote up a quick hack to do this in PL/pgSQL.

Here's the dict-replace function.
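(The link to the function isn't preserved in this excerpt. Roughly, such a function can be sketched in a few lines of PL/pgSQL using json_each_text; this is a simplified illustration, not necessarily Josh's actual implementation:)

```sql
CREATE OR REPLACE FUNCTION replace_vars(str TEXT, vardict JSON)
RETURNS TEXT LANGUAGE plpgsql AS $$
DECLARE
    pair RECORD;
BEGIN
    -- Swap each ${key} placeholder in the string for its value
    -- from the JSON dictionary.
    FOR pair IN SELECT key, value FROM json_each_text(vardict) LOOP
        str := replace(str, '${' || pair.key || '}', pair.value);
    END LOOP;
    RETURN str;
END;
$$;
```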

Using it, you'd replace the above with:

EXECUTE replace_vars('SELECT ${col1}, ${col2} FROM ${userschema}.accounts
  WHERE ${filterclause} ORDER BY ${col1}, ${col2}', vardict);

Of course, you need to first set up the vardict as a JSON value, in the form:

vardict := '{ "col1" : "username", "col2" : "branchname", ...}'

Still, much more readable, eh?  No refcounting, no repeating variables, no string breaks.  I used Bash's variable substitution syntax of ${var} because it seemed like the thing least likely to conflict with user text, unlike anything involving %.

There are some caveats, though: it's not performant, and you could probably find strings which will break it, particularly if you're swapping in JSON values.  You still have to do your own quoting of strings.  And if you have access to PL/Perl or PL/Python you don't need this nonsense.

But for the minimal case, it should help.
Posted by Shaun M. Thomas on 2015-08-28 at 17:00:23

This week we’ll be covering another method of Postgres partitioning. This is a technique I personally prefer and try to use and advocate at every opportunity. It’s designed to straddle the line between traditional partitioning and standard monolithic table structure by using table inheritance as a convenience factor. The assumption here is that end-user applications either:

  1. Know that partitioning is in use.
  2. Only load “current” data and don’t care about partitions.

These two things may seem mutually exclusive, but they can actually be complementary. There are also notable performance benefits to this approach that we’ll explore later.

The SQL for this structure is similar to what we used last week, but there are a couple of notable omissions.

TRUNCATE TABLE sensor_log;
CREATE TABLE sensor_log_part_2012 (
    CHECK (reading_date >= '2012-01-01' AND
           reading_date < '2013-01-01')    
) INHERITS (sensor_log);
CREATE TABLE sensor_log_part_2013 (
    CHECK (reading_date >= '2013-01-01' AND
           reading_date < '2014-01-01')    
) INHERITS (sensor_log);
CREATE TABLE sensor_log_part_2014 (
    CHECK (reading_date >= '2014-01-01' AND
           reading_date < '2015-01-01')    
) INHERITS (sensor_log);

This time, we don’t need the function that distributes the rows to the correct partition, or the associated trigger. Further, note that the 2015 partition is missing. This is possible because we’ve made two assumptions about how partitions will be used:

  1. Current data will be stored in the base table.
  2. Old or “archived” data will be moved into partitions.
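The second assumption implies a periodic archival step. A minimal sketch of what that might look like, moving a year of old rows out of the base table and into its partition (using the sensor_log names from this series):

```sql
-- Move 2014 rows from the base table into the 2014 partition.
-- ONLY restricts the DELETE to the parent table itself, leaving
-- rows already in child partitions untouched.
WITH moved AS (
    DELETE FROM ONLY sensor_log
     WHERE reading_date >= '2014-01-01'
       AND reading_date < '2015-01-01'
    RETURNING *
)
INSERT INTO sensor_log_part_2014
SELECT * FROM moved;
```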

To fulfill the first requirement, we made a slight change to the Python loader script introduced at the beginning of this series. We could have done this with the partitioning introduced last week as well, but while that is merely a performance tweak for that format, this method depends on it.

# Set these to modify how large the C

[continue reading]

PostgreSQL JDBC Driver Version 9.4-1202 released

Lots of bug fixes and some awesome performance enhancements, including statement caching

Version 9.4-1201 (2015-02-25)

  • ResultSet positioning methods in some particular cases PR #296 (282536b)
Author: Craig Ringer
  • Disable binary xfer on batches returning generated keys PR #273 (763ae84)
  • Add a new test case demonstrating a bug in returning support PR #273 (4d2b046)
  • Always Describe a query in a batch that returns generated keys PR #273 (a6bd36f)
Author: Dave Cramer
  • chore: fix build.xml to allow releasing to maven PR #262 (34f9361)
  • fix: BlobInputStream ignores constructor parameters #263 PR #273 (c1c6edc)
  • don't reset forceBinary transfer if we setPreparedThreshold (937a11c)
  • Revert "perf: Remove expensive finalize methods from Statement and Connection" PR #293 (a0d3997)
  • updated copyright PR #312 (263375c)
  • Revert "Issue 250 -- Adding setURL/getURL to" PR #312 (a1ac380)
  • fixed mailing list href PR #326 (c3e86a6)
  • increment driver version PR #346 (b8ee75d)
Author: David R. Bild:
  • feat: add equality support to PSQLState PR #277 (7698cd9)
  • Improve version checking PR #355 (f7a84db)
Author: Eugene Koontz
  • Add support within "private Object buildArray (PgArrayList input, int index, int count)" for array elements whose type is jsonb PR #349 (d313138)
  • Added setter method for logging level. The method exactly matches property name in documentation. PR #282 (d9595d1)
  • Added getter method. PR #282 (65759f0)
  • Adding XML catalog to help unit tests avoid remote entity resolution. PR #284 (cb87067)
  • Added support to locally resolve dtd or entity files. PR #284 (017970d)
  • Disable verbose logging of the catalog resolver. PR #284 (fcc34f5)
Author: Kris Jurka
  • Improve error message for failure to update multicolumn primary key RSs. PR #284 (05ff811)
  • Remove all JDBC3 code as JDK 1.4/1.5 are no longer supported. PR #284 (f9a956b)
  • Add preliminary support for JDBC4.2. PR #284 (bd05fd2)
Author: Lo

[continue reading]

I've been writing a little provisioning script for a Vagrant CentOS 7 machine that will be a buildfarm client.

Stripped of the vagrant stuff and some stuff that's special for the machine, here is what I did to get it running:

sudo yum install -y wget gcc make flex bison ccache git \
perl-libwww-perl perl-Digest-SHA perl-devel perl-ExtUtils-Embed \
zlib-devel openssl-devel readline-devel python-devel tcl-devel \
libxml2-devel libxslt-devel openldap-devel

mkdir bf
cd bf
mv latest-client.tgz buildfarm-latest-client.tgz
tar -z --strip-components=1 -xf buildfarm-latest-client.tgz
mkdir root
sed -i "s!build_root =>.*!build_root => '$HERE/root',!" build-farm.conf
At this stage the buildfarm client will run quite successfully, as can be tested using:
perl ./run_build.pl --test

While hacking PostgreSQL it’s very useful to know the pid of the backend you are working with. You need the pid of the process to attach a debugger, profiler etc. Luckily, .psqlrc provides us an elegant way to define shortcuts for psql. Using the config line below one can find out the backend pid just by typing :pid.

\set pid 'SELECT pg_backend_pid();'

=# :pid
 pg_backend_pid
----------------
          <pid>
(1 row)

In 9.6 it becomes possible to even include the backend pid in the psql prompt.
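For example, using the %p prompt escape added in 9.6 (placement of %p in the prompt string is up to you):

```
\set PROMPT1 '%/ [%p] %R%# '
```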

However, it’s possible to automate more complex actions in psql. I’ve configured my psql to run gdb attached to the current backend in a new tab of iTerm2 just by typing :gdb.

The :gdb command selects the pid of the current backend and pipes it to the pg_debug script.

\set gdb 'SELECT pg_backend_pid() \\g |pg_debug'

pg_debug extracts the pid from its input and then runs an OSA script which opens gdb in a new tab of iTerm2.



while read line
do
    # Extended display off
    if [[ $line =~ ^\ +([0-9]+) ]]; then PID=${BASH_REMATCH[1]}; fi
    # Extended display on
    if [[ $line =~ ^pg_backend_pid.*\ ([0-9]+) ]]; then PID=${BASH_REMATCH[1]}; fi
done

# Open gdb session
osascript -e "
tell application \"iTerm\"
    tell the current terminal
        set mysession to (the current session)
        launch session \"Default Session\"
        tell the last session
            write text \"gdb --pid=$PID -x <(echo continue)\"
        end tell
        select mysession
    end tell
end tell"

This script works for Mac OS X and iTerm2, but the same approach should work for other platforms and terminal emulators.

Posted by Pavan Deolasee in 2ndQuadrant on 2015-08-26 at 11:12:40

It’s been a busy few months as we work towards merging Postgres-XL with the latest and greatest release of PostgreSQL. Postgres-XL is an open source fork of PostgreSQL that provides a scalable platform for OLTP and Business Intelligence. The current release of Postgres-XL is based on PostgreSQL 9.2, so it lacks all the improvements made to PostgreSQL over the last three years.

2ndQuadrant and other companies are working on bringing distributed scalability into PostgreSQL core as well as building tools and extensions outside the core. As part of that, Postgres-XL has a number of features that we’d like to bring back into core PostgreSQL, so 2ndQuadrant has picked up the task of updating the Postgres-XL code base to the latest PostgreSQL release as the first step. After more than 3 months of work, PostgreSQL 9.5 is still in alpha, so we wanted to give a progress report on how the work is proceeding. I also need to say the magic words: This ongoing work on Postgres-XL is part of the AXLE project, funded by the European Union under grant agreement 318633.

Preparation for the Merge

Since PostgreSQL and Postgres-XL both use GIT as source control system, it makes the merge process much simpler since GIT provides many tools to assist the process. But as soon as we tried the merge, we faced the first hurdle.

We realised that the current Postgres-XL repository is based on an older minor 9.2 release of PostgreSQL. That means there were commits and changes in the Postgres-XL master branch which either never made it into PostgreSQL’s master branch or had different commit ids. So merging with the PostgreSQL master branch threw up far more conflicts than we would otherwise have expected. So the first task we had to accomplish was to rebase the Postgres-XL 9.2 repository on a later commit point. This obviously required careful treading, making sure that nothing broke during the process. Once we had the basic rebase done, we also merged all Postgres-XL bug fixes and enhancements, created a Postgres-XL 9.2 stable branch and merged the 9.2 br

[continue reading]

Posted by Yann Larrivee on 2015-08-26 at 00:50:48

ConFoo is once more seeking passionate speakers for the upcoming conference.

The event is happening in Montreal, Canada, between February 24th and 26th, 2016. It is an exciting conference for web developers with speakers from all over the world. It unites many web programming languages under one roof, as well as other topics related to web development. The call for papers closes on September 20th.

ConFoo renews 50% of its speakers each year. If you’re new to this conference, you should definitely submit.

If you would just like to attend, there is a discount until October 13th.

Posted by Terry Erisman in CitusData on 2015-08-25 at 17:13:49

We are pleased to announce that the PGConf Silicon Valley breakout sessions and tutorials are now posted on the conference website. Our outstanding Conference Committee has worked through the large pool of submissions and selected the sessions which will be presented November 17-18, 2015 at the South San Francisco Conference Center. As the first dedicated PostgreSQL conference in the Bay Area, the conference is a great opportunity for PostgreSQL community members to exchange technical knowledge and make new connections. Early Bird registration prices are available through October 4, 2015.

Breakout Sessions

The breakout sessions are organized into five tracks:

  • DevOps
  • In the Trenches
  • Features
  • Hackery
  • PostgreSQL at Scale

The outstanding list of speakers includes representatives from companies including TripAdvisor, Urban Airship, Square, Heap, Chartio, AWS, Rackspace, Joyent, and Heroku. The breakout sessions, speakers, and talk descriptions are available on the conference website.


The tutorials will be presented by leading PostgreSQL practitioners and range from beginner to advanced level. Presenters include representatives from 2ndQuadrant, EnterpriseDB, Citus Data, OmniTI, and PostgreSQL Experts. Check the conference website for talk descriptions and speaker information.

More Information Coming Soon

The schedule for the tutorials and breakout sessions will be posted to the PGConf Silicon Valley conference website in the near future. We will also post a series of interviews with conference speakers who will provide insights into their sessions and what they are looking forward to at the conference. 

Take advantage of Early Bird pricing before it ends on October 4th. I hope to see you at PGConf Silicon Valley!


PGConf Silicon Valley Breakout Sessions and Tutorials are Now Posted

November 17-18, 2015 at the South San Francisco Conference Center

ALTER TABLE has been known for many years in the Postgres ecosystem as a command that systematically takes an ACCESS EXCLUSIVE lock on the relation being modified, preventing all operations on the relation in parallel. Those locks are getting more and more relaxed; for example, in Postgres 9.4 the following commands got improvements:

  • ALTER COLUMN SET and ALTER COLUMN RESET for attribute options

In 9.5 as well those commands have been improved:


Now, Postgres 9.6, which is currently in development, brings in more lock reduction, with the following commit:

commit: 47167b7907a802ed39b179c8780b76359468f076
author: Simon Riggs <>
date: Fri, 14 Aug 2015 14:19:28 +0100
Reduce lock levels for ALTER TABLE SET autovacuum storage options

Reduce lock levels down to ShareUpdateExclusiveLock for all
autovacuum-related relation options when setting them using ALTER TABLE.

Add infrastructure to allow varying lock levels for relation options in
later patches. Setting multiple options together uses the highest lock
level required for any option. Works for both main and toast tables.

Fabrízio Mello, reviewed by Michael Paquier, mild edit and additional
regression tests from myself

Code-wise, ALTER TABLE SET has been improved to be able to define different types of locks depending on the parameter touched, and in the case of this commit all the parameters tuning autovacuum and auto-analyze at relation level have been updated to use SHARE UPDATE EXCLUSIVE. In short, this allows read as well as write operations to occur in parallel with the ALTER TABLE, something that will definitely help on relations with heavy activity.
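Concretely, with this commit a statement like the following (table name illustrative) no longer blocks concurrent reads and writes:

```sql
-- In 9.6 this takes only ShareUpdateExclusiveLock, not
-- AccessExclusiveLock, so SELECTs and DML proceed in parallel.
ALTER TABLE my_table SET (autovacuum_vacuum_scale_factor = 0.05,
                          autovacuum_vacuum_threshold = 200);
```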

Note as well that when multiple subcommands are used, the stronger lock of the whole set is taken for the duration of the ALTER TABLE command. So for example, should an update on the parameter fillfactor be mixed with a

[continue reading]

Posted by Gulcin Yildirim in 2ndQuadrant on 2015-08-23 at 16:01:35

PgBouncer is a lightweight connection pooler for PostgreSQL.

The latest release of PgBouncer was announced on 1st of August 2015. In this blog post we’ll talk about the major new improvements of PgBouncer.

Main new features of PgBouncer

Load user password hash from postgres database

With this hot feature, PgBouncer can load a user’s password from the database, using two config parameters: auth_user and auth_query.

Note: These config parameters are defined in the configuration file pgbouncer.ini.

  • auth_user
    If auth_user is set, any user not specified in auth_file will be looked up in pg_shadow in the database, using auth_user. The auth_user’s own password will be taken from auth_file. This parameter can also be set per-database.

    Note: auth_file is the name of the file to load user names and passwords from.

  • auth_query
    This parameter allows us to write a SQL query to load a user’s password from the database. It runs as auth_user. See the default query below:
    SELECT usename, passwd FROM pg_shadow WHERE usename=$1
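Put together, the relevant pgbouncer.ini fragment might look like this (the auth_user value is illustrative; the auth_query shown is the default):

```ini
[pgbouncer]
auth_user = pgbouncer
auth_query = SELECT usename, passwd FROM pg_shadow WHERE usename=$1
```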

Pooling mode can be configured both per-database and per-user

With this feature, independent of the main pooling mode, clients can now connect to different databases with one of the 3 pooling modes described below. The same applies to users. For example, if the main pooling mode is session pooling, a specific user can be configured to use transaction pooling. This gives us database-level and user-level flexibility to apply more appropriate pooling options.

PgBouncer provides 3 connection pooling modes:

  • Session pooling
    During the lifetime of a client connection, an existing server connection is assigned to the client and after the client disconnection, the assigned server connection is put back to the connection pool.
  • Transaction pooling
    In this mode, a server connection is not assigned to a connected client immediately: it is assigned only for the duration of a transaction. As soon as the transaction is over, the connection is put back into the pool.
  • Statement pooling
    This is similar to the transa

[continue reading]

Posted by Shaun M. Thomas on 2015-08-21 at 16:32:15

PG Phriday: Basic Partitioning

Most PGDB (PostgreSQL) users who are familiar with partitioning use the method described in the partitioning documentation. This architecture comes in a fairly standard stack:

  • One empty base table for structure.
  • At least one child table that inherits the base design.
  • A trigger to redirect inserts based on the partitioning scheme.
  • A constraint on each child table to enforce the partition scheme, and help the planner exclude child partitions from inapplicable queries.

It looks simple, but there’s a lot of SQL involved. Let’s take the sensor_log table we introduced a couple of weeks ago and turn it into a partitioned structure.

TRUNCATE TABLE sensor_log;
CREATE TABLE sensor_log_part_2012 (
    CHECK (reading_date >= '2012-01-01' AND
           reading_date < '2013-01-01')    
) INHERITS (sensor_log);
CREATE TABLE sensor_log_part_2013 (
    CHECK (reading_date >= '2013-01-01' AND
           reading_date < '2014-01-01')    
) INHERITS (sensor_log);
CREATE TABLE sensor_log_part_2014 (
    CHECK (reading_date >= '2014-01-01' AND
           reading_date < '2015-01-01')    
) INHERITS (sensor_log);
CREATE TABLE sensor_log_part_2015 (
    CHECK (reading_date >= '2015-01-01' AND
           reading_date < '2016-01-01')    
) INHERITS (sensor_log);
CREATE OR REPLACE FUNCTION sensor_partition()
RETURNS TRIGGER AS $$
BEGIN
  CASE extract(YEAR FROM NEW.reading_date)
    WHEN 2015 THEN
      INSERT INTO sensor_log_part_2015 VALUES (NEW.*);
    WHEN 2014 THEN
      INSERT INTO sensor_log_part_2014 VALUES (NEW.*);
    WHEN 2013 THEN
      INSERT INTO sensor_log_part_2013 VALUES (NEW.*);
    WHEN 2012 THEN
      INSERT INTO sensor_log_part_2012 VALUES (NEW.*);
  END CASE;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER t_correct_partition
BEFORE INSERT ON sensor_log
   FOR EACH ROW EXECUTE PROCEDURE sensor_partition();

[continue reading]

Posted by Andreas Scherbaum on 2015-08-20 at 20:17:49
Andreas 'ads' Scherbaum

What happens if you ask the Chinese PostgreSQL community for a Meetup-like event with one or two speakers? You end up getting a full-day conference ;-)


On September 12, Saturday, a full day PostgreSQL conference will take place at the Pivotal office in Beijing. If you want to attend, please sign up here.

The full address is: 17/F, South Block, Tower C, Raycom Info Tech Park, NO.2, Kexueyuan South Road, Beijing 100190, China.


Continue reading "PostgreSQL Day in Beijing"

At this year's PGCon, we gave a talk on pg_shard that included a futuristic pg_shard demo: a distributed table with JSONB fields, backed by a dynamically changing row and columnar store. The demo is based on the Github archive data, which comprises a log of all events across all public Github repositories with detailed metadata in JSON format.

The pg_shard extension can store very large event logs in a PostgreSQL table by transparently sharding the table across many servers. We envision a use-case where new events are added to the table in real-time and frequent look-ups occur on data less than 7 days old. Data older than 7 days is compressed inside the database using cstore_fdw. A video of the pg_shard talk and demo is available on our website and at the end of this post.

In our demo, the events table was range-partitioned by time, such that each shard contains events for a specific hour of the day. One of the advantages of range-partitioning a table by time is that older shards are read-only, which makes it easier to perform certain operations on the shards, such as replacing the shard with an equivalent cstore_fdw table. We consider this demo to be futuristic since pg_shard does not yet have functions to create a range-partitioned table. It can only be set up by manually changing the pg_shard metadata tables, for which we created several PL/pgSQL functions.
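
The shard-replacement step might look roughly like the sketch below. All object names here (events_102008 and friends) are hypothetical, and the pg_shard metadata update is elided, since at the time of the demo it had to be done by hand:

```sql
-- One-time setup for cstore_fdw (assumes the extension is installed).
CREATE EXTENSION IF NOT EXISTS cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- Columnar copy of a read-only shard (hypothetical schema).
CREATE FOREIGN TABLE events_102008_cstore (
    event_id   bigint,
    created_at timestamptz,
    payload    jsonb
) SERVER cstore_server OPTIONS (compression 'pglz');

-- Bulk-load the old shard into the compressed table, then swap names.
INSERT INTO events_102008_cstore SELECT * FROM events_102008;
DROP TABLE events_102008;
ALTER FOREIGN TABLE events_102008_cstore RENAME TO events_102008;
```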

The architecture used in the demo resembles a shard assembly line, as shown below. In the first stage of the assembly line the shard is filled with data; after that the shard is frequently read by SELECT queries, and after 7 days the shard is compressed for archival. When pg_shard receives an INSERT, it compares the event's timestamp to the time ranges of the shards in the metadata tables to find the shard for the current hour. Assuming INSERTs only happen for the current time, there is only ever one shard to which new data is being written. At the end of the hour, INSERTs will start going into a new shard. The shard for the next hour needs to be set up in advance

[continue reading]

On 30th of July, Andrew Dunstan committed patch: Add IF NOT EXISTS processing to ALTER TABLE ADD COLUMN   Fabrízio de Royes Mello, reviewed by Payal Singh, Alvaro Herrera and Michael Paquier. Nice. I always like it when there is a new "IF EXISTS" or "IF NOT EXISTS", because it makes my life as a DBA simpler. This […]
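
For illustration (the table and column names here are made up), the new clause turns a duplicate ADD COLUMN into a notice instead of an error:

```sql
CREATE TABLE demo (id integer);

ALTER TABLE demo ADD COLUMN IF NOT EXISTS payload jsonb;
-- Running it again is now harmless:
ALTER TABLE demo ADD COLUMN IF NOT EXISTS payload jsonb;
-- NOTICE:  column "payload" of relation "demo" already exists, skipping
```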
Posted by Josh Berkus on 2015-08-18 at 23:58:57

For September, two of the founders of PipelineDB are going to be introducing this streaming PostgreSQL fork. Come learn what PipelineDB is about and how you'd use it alongside PostgreSQL. This meetup will be at a new venue, If(we), a social networking company which uses PostgreSQL together with internal databases. RSVP on meetup.

P.S. we are looking for a lightning talk for this meetup! Contact

Posted by Glyn Astill on 2015-08-18 at 18:52:44
Prior to pg 9.1, you could connect to any database with psql regardless of encoding and you’d get the server encoding as your client encoding unless you set it: That wasn’t quite right; the client_encoding is a lie. On a modern psql version that’s quite rightly prevented: This is not an issue if you want […]
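
For reference (not part of the excerpt above), on a modern server you can inspect both encodings and set the client side explicitly; the conversion is then performed by the server:

```sql
SHOW server_encoding;             -- fixed at CREATE DATABASE time, e.g. UTF8
SHOW client_encoding;             -- what the client negotiated for this session
SET client_encoding TO 'LATIN1';  -- ask the server to convert for this client
```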
Posted by Julien Rouhaud in Dalibo on 2015-08-18 at 14:03:34


The work memory, or work_mem, is one of the hardest things to configure. It can be used for various purposes: it’s mainly used when sorting data or building hash tables, but it can also be used by set-returning functions that use a tuplestore, like the generate_series() function. And each node of a query can use this amount of memory. Set this parameter too low, and a lot of temporary files will be used; set it too high, and you may encounter errors, or even an Out Of Memory (OOM) kill, depending on your OS configuration.

I’ll focus here on the amount of memory needed when sorting data, to help you understand how much memory is required when PostgreSQL runs a sort operation.
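
As a quick way to see this in action (the exact numbers will vary on your system), EXPLAIN ANALYZE reports which sort strategy was used at the current work_mem setting:

```sql
SET work_mem = '64kB';   -- the minimum value; forces a spill for this data set
EXPLAIN (ANALYZE, COSTS OFF)
SELECT i FROM generate_series(1, 100000) AS g(i) ORDER BY i DESC;
-- ... Sort Method: external merge  Disk: ...kB

SET work_mem = '16MB';   -- now the whole sort fits in memory
EXPLAIN (ANALYZE, COSTS OFF)
SELECT i FROM generate_series(1, 100000) AS g(i) ORDER BY i DESC;
-- ... Sort Method: quicksort  Memory: ...kB
```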

Truth is out

I often hear people say there is a correlation between the size of the temporary files generated and the amount of memory needed. That’s wrong: you can’t make any assumption about the value of work_mem based on the size of a sort’s temporary file.

That’s because when the data to be sorted doesn’t fit in the allowed memory, PostgreSQL switches to an external sort. In addition to the memory currently in use, a temporary file is reused multiple times to avoid wasting disk space. If you want more details on this, the relevant source code is in tuplesort.c and logtape.c. As a brief introduction, the header of tuplesort.c says:

[…] This module handles sorting of heap tuples, index tuples, or single Datums (and could easily support other kinds of sortable objects, if necessary). It works efficiently for both small and large amounts of data. Small amounts are sorted in-memory using qsort(). Large amounts are sorted using temporary files and a standard external sort algorithm.

See Knuth, volume 3, for more than you want to know about the external sorting algorithm. We divide the input into sorted runs using replacement selection, in the form of a priority tree implemented as a heap (essentially his Algorithm 5.2.3H), then merge the runs using polyphase merge, Knuth’s Algorithm 5.4.2D. The logical “tapes” used by Algorithm D are imp

[continue reading]

Q: So, what is TABLESAMPLE actually good for?

A: To speed things up that can be answered from a small sample instead of the whole population!

Like so:

CREATE TABLE public.ts_test
(
  id serial,
  val double precision,
  CONSTRAINT ts_test_pkey PRIMARY KEY (id)
);
Insert 100 million random values in the range 0.0 <= x < 1.0:

INSERT INTO public.ts_test (val)
SELECT random()
  FROM generate_series(1, 100000000);

And now, some queries to get the standard deviation.

For the whole population:

SELECT round(stddev_pop(val)::numeric,5) FROM ts_test;

0.28869 in 15005 ms
Error: 0%
Speedup: 1x

For a sample of size 20000 (good for 1% error and 99% confidence):

SELECT round(stddev_samp(val)::numeric,5) FROM ts_test TABLESAMPLE BERNOULLI(0.02);

0.28867 in 2387 ms
Error: < 1%
Speedup: 6.3x

And for a SYSTEM sample of the same size:

SELECT round(stddev_samp(val)::numeric,5) FROM ts_test TABLESAMPLE SYSTEM(0.02);

0.28791 in 25 ms
Error: < 1%
Speedup: 600x

With a table that large, both methods BERNOULLI and SYSTEM are accurate enough to give results well below the 1% error target.

(For potential problems with SYSTEM, see this post)

SYSTEM aces BERNOULLI with a speedup of 600x vs. 6.3x relative to a full population query.

And this is hard to beat...
On 17th of July, Robert Haas committed patch: Add new function pg_notification_queue_usage.   This tells you what fraction of NOTIFY's queue is currently filled.   Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few further tweaks by me. LISTEN / NOTIFY are very cool features, that don't get enough recognition. For completeness sake […]
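
For the curious, the new function takes no arguments and returns the fraction (between 0 and 1) of the NOTIFY queue currently occupied by pending notifications:

```sql
SELECT pg_notification_queue_usage();
-- ~0 on an idle system; approaches 1 as listeners fall behind
```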
Posted by Federico Campoli on 2015-08-17 at 21:19:00
Friday 14th August we kicked off the Brighton PostgreSQL Meetup.
We had a nice evening with cool people, all together discussing PostgreSQL and how we can run the meetup effectively.

We decided to have a regular monthly meetup hosted around Brighton, possibly by local companies or any other suitable venue.

The next meetup will be on the 25th of September, and this time there will be some PostgreSQL talks. General interest favours standby servers and streaming replication; another topic could be query planning and execution.
I can cover both topics. We just need to decide how many talks will fit in a one-hour session.

After the meetup we’ll head to the nearest pub for a nice beer.

I can only say a big thank you, to all the meetup members.

That’s all for now.