
© Laurenz Albe 2025
While I sometimes write about real-life problems I encounter in consulting sessions (see this example), I admit that curiosity and the desire to play are the motivation for many of my articles (like this one). So I am glad that a friend of mine, whose job as a DBA puts him on the front line every day, offered to let me quote from his (unreleased) memoirs. Finally, this is material that I hope will connect with all the hard-working souls out there! My friend insists on anonymity, hence the “bastard DBA from hell” in the title.
I like being a DBA. The only problem is the people using the databases.
A week ago I get a call from accounting. The guy sounds like I stole his sandwich, “We get deadlocks all the time.”
“Well, don't do it then,” I tell him. I should have known that this kind of person wouldn't listen to good advice.
“It's your Postgre system that gives me these deadlock errors. So you fix them”.
“As you wish.” >clickety click< “Ok, you shouldn't get any more deadlock errors now.”
He mumbles something and hangs up. It didn't sound like “thank you”. That's fine with me. After all, I set his lock_timeout to 500.
Yesterday the guy from accounting called me again, in that accusing, complaining tone, “Our disk is 95% full.”
“I know, I got an alert from the monitoring system. Actually, I sent you guys an e-mail about it. 10 days ago.”
“Can't you fix that?”
“Do you want me to extend the file system?”
“We have no more data in there than before. So why do we need a bigger disk?”
I tell him “hang on” and log into their system. >clickety click<
“Well, you got a 350GB table there that's taking up all the space. But it's 99% dead data... Did you create that prepared transaction a month ago and never commit it?” >clickety click< “There, gone now. Just run a VACUUM (FULL) on the table and you'll be good.”
I thought I was r
[...]PostgreSQL 18 brings a couple of performance-related features to the table which will help applications run more efficiently, providing a better and more enjoyable user experience. One of those performance features is called “skip scans”. Most of you might ask yourself at this point: Wow, sounds cool, but what is a skip scan? The purpose of this post is to shed some light and explain how this works, what it does and, most importantly, how one can benefit from this feature in real life.
To understand what a skip scan does, we first need to create some test data that is suitable for the challenge. Here is a classical example which can help to illustrate the topic nicely:
test=> CREATE UNLOGGED TABLE tab (
frequent integer,
rare integer
);
CREATE TABLE
test=> INSERT INTO tab
SELECT i / 100000, i % 100000
FROM generate_series(0, 999999) AS i;
INSERT 0 1000000
test=> VACUUM (ANALYZE) tab;
VACUUM
test=> CREATE INDEX ON tab (frequent, rare);
CREATE INDEX
We have created a table with just two columns. The first column contains just a handful of different values (so most of them are identical, making the index struggle). The second column contains many more different values. The challenge now is: How can we make use of the combined index we created efficiently?
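As a hedged illustration of that challenge (my query, not one from the post; the exact plan depends on the PostgreSQL version and the data), consider filtering on the second column only:
test=> EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tab WHERE rare = 1234;
Before PostgreSQL 18, such a predicate cannot use the (frequent, rare) index efficiently, because the leading column is missing from the WHERE clause; a skip scan lets the index be searched once per distinct value of frequent.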
In general, a btree in PostgreSQL can be seen as a sophisticated version of a sorted list. This is also true if we are dealing with a composite index. In our case, a subset of the index might look like this:
A composite index is sorted by “frequent” and “rare” (just like a phone book). Of course, there are various sophistications such as deduplication and a lot more, but the basic principle remains the same: We are dealing with a list sorted by those columns (or expressions) in the index. If you want to learn more about indexing, consider the following posts:
[...]Making GROUP BY a bit easier to use is, in my experience, among the top three requested features in SQL.
Like, if you do
CREATE TABLE t1 (a int, b int, ...);
SELECT a, avg(b) FROM t1 GROUP BY a;
the column list in the GROUP BY clause doesn’t convey much information. Of course you wanted to group by a; there is no other reasonable alternative. You can’t not group by a because that would be an error, and you can’t group by things besides a, because there is nothing else in the select list other than the aggregate.
The problem gets worse if you have a longer select list or complicated expressions, because you need to repeat these in the GROUP BY clause and carefully keep them updated if you change something. (Or you can try to work around this by using subqueries.) Could be easier!
A number of implementations have adopted the syntax GROUP BY ALL to simplify this.
SELECT a, avg(b) FROM t1 GROUP BY ALL;
The SQL standard working group discussed this feature informally at the June 2025 meeting, and there was consensus about going forward with it.
At the September 2025 meeting, a formal change proposal was brought forward and accepted. (So, technically, it’s just a working draft right now, and it won’t be final until the standard is released.)
The formal meaning of GROUP BY ALL is that it expands to a list of the elements of the select list that do not contain aggregate functions. So in
SELECT a, avg(b) FROM t1 GROUP BY ALL;
a does not contain an aggregate function, but avg(b) does, so GROUP BY ALL resolves to GROUP BY a, as expected.
This doesn’t completely remove the need for explicit GROUP BY lists. Consider a more complicated case like
CREATE TABLE t1 (a int, b int, c int, d int, ...);
SELECT a, avg(b) + c + d FROM t1 GROUP BY ALL;
Here, a does not contain an aggregate function, but avg(b) + c + d does contain an aggregate function, so the query resolves to
SELECT a, avg(b) + c + d FROM t1 GROUP BY a;
But that is not valid, because you need t
[...]
Your API is slowing down. You check your database and find 42 indexes on your users table. Which ones can you safely drop? How much performance are they costing you? Let's look at what actually happens in Postgres when you have too many indexes.
If you're a backend or full-stack engineer, you probably don't want to become an indexing expert — you just want your API fast and stable, without babysitting pg_stat_user_indexes.
Index maintenance includes multiple activities: dropping unused indexes, dropping redundant indexes, and rebuilding indexes on a regular basis to get rid of index bloat (and of course, keeping autovacuum well tuned).
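For the first of those activities, here is a minimal sketch (my query, assuming the standard pg_stat_user_indexes view; check replicas and how long statistics have been collected before actually dropping anything):
-- Indexes that have never been used by scans on this node: candidates for review.
-- Note: unique/constraint-backing indexes must not be dropped just because idx_scan = 0.
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;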
There are many reasons why we need to keep our index set lean, and some of them are tricky.
I have been collecting these ideas for years. Here's the current list (more to come):
As for index bloat, reasons 3-6 apply to bloated indexes as well. Plus, if an index is extremely bloated (e.g., 90%+ bloat, or more than 10x the optimal size), index scan latencies suffer. The Postgres B-tree implementation lacks merge operations — once a page splits, those pages never merge back together, even after deletions. Over time, this leads to increasing fragmentation and bloat. Deduplication in PG13+ helps compress duplicate keys, and bottom-up deletion in PG14+ reduces bloat by removing dead tuples more aggressively during insertions. However, these features don't address structural degradation from page splits. Regular monitoring and rebuilding of bloated indexes remains essential maintenance work.
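To put numbers on bloat and fragmentation for a single index, one option is the pgstattuple extension (a sketch; users_email_idx is a hypothetical index name):
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Low avg_leaf_density and high leaf_fragmentation suggest the index is a rebuild candidate.
SELECT index_size, avg_leaf_density, leaf_fragmentation
FROM pgstatindex('users_email_idx');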
Let's examine each item from the list, studying Postgres source code (here, it's Postgre
[...]This quarter's round of minor releases is expected later this week. Note that this will be the last minor release for PostgreSQL 13.
PostgreSQL 19 changes this week
logging: wal_fpi_bytes value shown in log entries for VACUUM and ANALYZE operations
logical replication support for sequence synchronization added
pgbench option --continue-on-error added
pg_stat_subscription_stats column seq_sync_error_count added
psql \pset variables display_true and display_false added
SET NULL now accepted for specifying an empty list
WAIT FOR SQL command added

PostgreSQL 18 articles
PostgreSQL 18 enables data‑checksums by default (2025-11-03) - Josef Machytka / credativ
What do the new Index Searches lines in EXPLAIN mean? (2025-10-10) - Michael Christofides / pgMustard

New podcast episode published by Claire Giordano from her series “Talking Postgres”: Building a dev experience for Postgres in VS Code with Rob Emanuele
Blog posts
Ellyne Phneah Key takeaways from the PostgreSQL & AI Summit: RAG, agents, and tuning
Claire Giordano Postgres Trip Summary from PGConf EU 2025
Postgres has an awesome amount of data collected in its own internal tables. Postgres hackers know all about this - but software developers and folks working on day-to-day Postgres tasks often miss out on the good stuff.
The Postgres catalog is how Postgres keeps track of itself. Of course, Postgres would do this in a relational database with its own schema. Throughout the years, several nice features, like psql tools and views, have been added that make navigating Postgres’ internal tables even easier.
Today I want to walk through some of the most important Postgres internal data catalog details. What they are, what is in them, and how they might help you understand more about what is happening inside your database.
The easiest way to get at some of Postgres’ internal catalogs is to use the built-in psql commands, which generally begin with \d. Here are some common ones users should be comfortable using:
\d {tablename}: describes a specific table. \d will do a lot of things if you qualify \d with a table or view name.
\di: list all your indexes
\dx: list installed extensions
\dp: to show access privileges
\dp+: tables and views with the roles and access details
\dconfig: your current configuration settings
\dt {tablename}: list tables (optionally matching a pattern)
\dti+: tables and indexes with sizes
\dg+: show role names
\df: show your functions
\dv {view name}: describe a view
\l: lists all your databases
Postgres exposes many of the complex internals of the database system in easy-to-query views. These host a wealth of information about what is going on inside your database and give direct SQL access to answer in-the-moment emergency questions like “what is taking up all my CPU” and longer-term questions like “what are my 10 slowest queries”.
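For example, the in-the-moment question can be answered directly from one of these views (a sketch using pg_stat_activity, which is described next):
-- What is running right now, longest-running statements first.
SELECT pid, state, now() - query_start AS runtime, left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC NULLS LAST;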
pg_stat_activity: Shows current database activity, including running queries, state, and client information. Essential for troubleshoot
[...]
I'm happy to announce release 0.8 of pg_statviz, the minimalist extension and utility pair for time series analysis and visualization of PostgreSQL internal statistics.
This release adds support for PostgreSQL 18, adapting to significant catalog view changes introduced in this release:
pg_stat_io view now reports I/O activity in bytes (read_bytes, write_bytes, extend_bytes) rather than just operation counts, providing more granular insight into system I/O patterns.
WAL I/O statistics have moved from pg_stat_wal to pg_stat_io (where object = 'wal'), consolidating I/O metrics across different subsystems.
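As a quick illustration of where those metrics now live (a sketch against PostgreSQL 18, using the columns mentioned above):
SELECT backend_type, context, reads, read_bytes, writes, write_bytes
FROM pg_stat_io
WHERE object = 'wal';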
The extension automatically detects your PostgreSQL version and uses the appropriate catalog views, maintaining backward compatibility with PostgreSQL 14 through 17 while taking advantage of the enhanced metrics available in PostgreSQL 18. The Python utility has also been updated to handle both the new byte-based I/O metrics and the operation counts from earlier versions.
This release also bumps the minimum Python requirement to 3.11+ and updates the numpy dependency to 2.3.4.
pg_statviz takes the view that everything should be light and minimal. Unlike commercial monitoring platforms, it doesn't require invasive agents or open connections to the database — it all lives inside your database. The extension is plain SQL and pl/pgSQL and doesn't require modules to be loaded, the visualization utility is separate and can be run from anywhere, and your data is free and easy to export.
You can install pg_statviz from the PostgreSQL repositories or PGXN.
PostgreSQL 13 will officially reach End-of-Life (EOL) on November 13, 2025. After this date, the PostgreSQL Global Development Group will stop releasing security patches and bug fixes for this version. That means if you’re still running PostgreSQL 13, you’ll soon be on your own with no updates, no community support, and growing security risks. Why […]
In the third installment of the PostGIS Performance series, I wanted to talk about performance around bounding boxes.
Geometry data is different from most column types you find in a relational database. The objects in a geometry column can be wildly different in the amount of the data domain they cover and the amount of physical space they take up on disk.
The data in the “admin0” Natural Earth data range from the 1.2 hectare Vatican City, to the 1.6 billion hectare Russia, and from the 4 point polygon defining Serranilla Bank to the 68 thousand points of polygons defining Canada.
SELECT ST_NPoints(geom) AS npoints, name
FROM admin0
ORDER BY 1 DESC LIMIT 5;
SELECT ST_Area(geom::geography) AS area, name
FROM admin0
ORDER BY 1 DESC LIMIT 5;
As you can imagine, polygons this different will have different performance characteristics:
Your spatial indexes are “r-tree” indexes, where each object is represented by a bounding box.
The bounding boxes can overlap, and it is possible for some boxes to cover a lot of the dataset.
For example, here is the bounding box of France.
What?! How is that France? Well, France is more than just the European parts: it also includes the island of Réunion, in the southern Indian Ocean, and the island of Guadeloupe, in the Caribbean. Taken together they result in this very large bounding box.
Such a large box makes a poor addition to the spatial index of all the objects in “admin0”. I could be searching with a query key in the middle of the Atlantic, and the index would still be telling me “maybe it is in France?”.
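To see exactly what box the index is working with, you can query it directly (a sketch, assuming the admin0 table from the post and PostGIS's Box2D function):
SELECT name, Box2D(geom) AS bbox
FROM admin0
WHERE name = 'France';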
For this testing, I have made a synthetic dataset of one million random points covering the whole world.
CREATE TABLE random_normal AS
SELECT id,
ST_Point(
random_normal(0, 180),
r[...]
Due to an unfortunate recent visitation by the Virus of the Decade (so far), I have a backlog of these which I'm trying to work through, so in the remote chance anyone is waiting with bated breath for the newest editions, my apologies. Normal service will be resumed as soon as humanly possible.
PostgreSQL 19 changes this week
Add log_autoanalyze_min_duration

PostgreSQL 19 articles
Waiting for PostgreSQL 19 – Add IGNORE NULLS/RESPECT NULLS option to Window functions. (2025-10-13) - Hubert 'depesz' Lubaczewski

PostgreSQL 18 articles
Benchmarking Postgres 17 vs 18 (2025-10-14) - Ben Dickens / PlanetScale

Almost a month late, but I hope you enjoy it!
In one of our earlier blogs, we explored how improper volatility settings in PL/pgSQL functions — namely using IMMUTABLE, STABLE, or VOLATILE — can lead to unexpected behavior and performance issues during migrations.
Today, let’s revisit that topic from a slightly different lens. This time, we’re not talking about your user-defined functions, but the ones PostgreSQL itself provides — and how their volatility can quietly shape your query performance, especially when you’re using partitioned tables.
When Partitioning Doesn’t Perform
When designing partitioning strategies, performance is usually the first thing that comes to mind.
But what if, despite all your tuning, the queries still fail to leverage partition pruning effectively?
The answer sometimes lies not in your schema or indexes, but in how your filters are evaluated.
Every function you use in a filter has its own volatility behavior that determines how it’s planned and executed. You might pick the right filter logically — but if the function is volatile, it can completely change how PostgreSQL builds and optimizes the plan.
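As a hedged sketch of the effect (using the events_daily table defined below; the exact plans are an assumption): a STABLE expression such as current_date still allows partition pruning, while a VOLATILE one such as clock_timestamp() does not.
-- Prunes: current_date is STABLE, so the planner/executor can restrict the scan to one partition.
EXPLAIN SELECT count(*) FROM events_daily WHERE created_date = current_date;
-- Does not prune: clock_timestamp() is VOLATILE, so every partition must be scanned.
EXPLAIN SELECT count(*) FROM events_daily WHERE created_date = clock_timestamp()::date;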
Let’s say you’ve designed a table with daily partitions, use date functions as partition filters, and check the underlying execution plans.
CREATE TABLE IF NOT EXISTS events_daily (
id bigint generated always as identity ,
created_at timestamptz NOT NULL,
created_date date NOT NULL,
user_id int NOT NULL,
amount numeric(10,2) NOT NULL,
status text,
payload text
) PARTITION BY RANGE (created_date);
--create a year of partitions as daily
DO
$$
DECLARE
start_date date := current_date - INTERVAL '364 days';
d date;
part_name text;
BEGIN
d := start_date;
WHILE d <= current_date LOOP
part_name := format('events_p_%s', to_char(d, 'YYYYMMDD'));
EXECUTE format($sql$
CREATE TABLE IF NOT EXISTS %I PARTITION OF events_daily
FOR VALUES FROM (%L) TO (%L);
$sql$, part_name, d::text, (d + INTERVAL '1 day')::da
[...]
As a database consulting company, we are often faced with analytics and reporting related tasks which seem to be easy on the surface but are in reality not that trivial. The list of those seemingly simple things is longer than one might think, especially in the area of reporting.
The purpose of this post is not to share some fancy SQL techniques, some super cool magic AI tool that does whatever new hot thing or some crazy coding technique one can use in PostgreSQL. This is all about two simple things that are often forgotten or just neglected. We are of course talking about “awareness” and “clarity”.
Yes, it is that simple. If you don’t know what you have or if you don’t know what you want, all data is worthless (in a best case scenario) or even dangerous.
One of those seemingly simple things might be related to a trivial question: “How many customers do you have?”. Every CEO will proudly talk about this number, but have you ever asked those guys a simple question: “What is a customer?”.
Well, just try. In my experience the process often starts with an arrogant smile that indicates “What stupid question is that?”. But just dig deeper and keep asking for details. I can assure you: In the majority of cases, people have no clue what they are talking about.
Here are some typical scenarios:
One could continue this list forever and a day. In other words: it is not that simple, and many “business people” tend to think that displaying things in a pie chart has the power to change the fundamental problem: “What do you actually count?”
The following data structure shows a simple scenario: A bank wants t
[...]
I pushed the most boring change imaginable: add an index. Our CI/CD pipeline is textbook ==> spin up a fresh DB, run every migration file sequentially, in one single transaction. If anything hiccups, the whole thing rolls back and the change never hits main. Foolproof autotests.
Enter The Drama Queen:
ERROR: cannot CREATE INDEX "index_name_xyz" on table abcdef_tab
because it is being used by active queries in this session
This session. Same PID, same transaction. No parallel runner. No second connection. Still blocked.
Our schema migration repository structure is as shown above. The CI tool will create a fresh DB → execute all scripts chronologically → one by one → one transaction → one session, as part of an autotest before merging any new code (DML/DDL for schema migration = infrastructure as code) to the main branch.
Naturally ignoring the last part of the error message like a seasoned DBA, I accused the CI tool of going rogue: maybe someone “optimized” migrations to run in parallel after reading an LLM pep talk about “unleashing concurrency.”
I printed the PID at the start and end of every file.
-- Stamp the migration transaction
SELECT 'start', pg_backend_pid(), txid_current();
-- ... run all migration steps ...
SELECT 'end', pg_backend_pid(), txid_current();
Same PID. Same transaction. Same sadness.
One of the least talked about but best features of Postgres as a FOSS project is that you can search for an error message in the code base, and the documentation is excellent.
As I explained in my talk at PostgreSQL Conference Europe 2025, data corruption can be silently present in any PostgreSQL database and will remain undetected until we physically read the corrupted data. There can be many reasons why some data blocks in tables or other objects can be damaged. Even modern storage hardware is far from being infallible. Binary backups done with the pg_basebackup tool – which is a very common backup strategy in PostgreSQL environments – leave these problems hidden, because they do not check the data but copy whole data files as they are. With the release of PostgreSQL 18, the community decided to turn on data‑checksums by default – a major step toward early detection of these failures. This post examines how PostgreSQL implements checksums, how it handles checksum failures, and how we can enable them on existing clusters.
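Before getting into the mechanics, here is a quick sketch of how to check the current state on a running cluster (the pg_stat_database columns exist since PostgreSQL 12):
SHOW data_checksums;
-- Databases where checksum verification has already caught something.
SELECT datname, checksum_failures, checksum_last_failure
FROM pg_stat_database
WHERE checksum_failures > 0;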
A PostgreSQL table or index is stored in 8 KB pages. When a page is written to disk, PostgreSQL computes a 16‑bit checksum using every byte of the page (except the checksum field itself) and the page’s physical block address. The checksum is stored in the page header. On every read, PostgreSQL recalculates the checksum and compares it against the stored value. Because the block address is part of the calculation, the system detects both bit flips within the page and pages written to the wrong place. Checksums are not maintained while the page sits in shared buffers – they are computed only when the page is flushed from the buffer cache to the operating system page cache. Consequently, an incorrect in‑memory page cannot be detected until it is written and read again. PostgreSQL uses a fast FNV‑1a hash (with CRC32C on WAL records) that is optimized for performance. On typical hardware the cost of calculating checksums seems to be small; benchmarking studies found the penalty is usually less than 2% for normal workloads. PostgreSQL 18’s release notes acknowledge that the overhead is non‑zero but accept it for the benefit of data integrit
One of the most read articles at boringSQL is Time to Better Know The Time in PostgreSQL where we dived into the complexities of storing and handling time operations in PostgreSQL. While the article introduced the range data types, there's so much more to them. And not only for handling time ranges. In this article we will cover why to consider range types and how to work with them.
But before we can talk about the range types, let's try to understand why we should look at them in the first place. Let's imagine a booking platform for large flash sales of seats, one that goes live at 10pm and will be taken by storm by thousands of people who want to get their tickets.
CREATE TABLE seat_holds (
hold_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
seat_id INTEGER NOT NULL REFERENCES seats(id),
user_session_id UUID NOT NULL,
-- define the hold period explicitly
hold_started_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
hold_expires_at TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX seat_holds_on_seat_id ON seat_holds(seat_id);
CREATE INDEX seat_holds_on_expiration ON seat_holds(hold_expires_at);
While the table design looks perfectly reasonable, it has one serious flaw: there is no database-level atomicity guarantee to prevent two holds for the same seat_id at the same time. The table design relies on application logic to check for existing holds before inserting a new one, and at the same time it does not provide any high-concurrency guarantee.
If all you have in your toolbelt are those two timestamp columns, you will end up increasing the complexity to make it work. You started with one problem, and soon your application developers might ask you to add a caching layer (most likely an external K/V store) to place the holds there, and very soon you have N problems when you resort to building a complex, custom application
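The excerpt breaks off here, but as a hedged sketch of where range types lead (my code, not necessarily the post's solution): a single tstzrange column plus an exclusion constraint lets the database itself reject overlapping holds for the same seat.
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- needed to combine seat_id equality with range overlap
CREATE TABLE seat_holds_ranged (
    hold_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    seat_id INTEGER NOT NULL REFERENCES seats(id),
    user_session_id UUID NOT NULL,
    hold_period TSTZRANGE NOT NULL,
    EXCLUDE USING gist (seat_id WITH =, hold_period WITH &&)
);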
[...]PostgreSQL received attention through the following contributions at Data Stack Conf 2025 on Oct 29, 2025:
Speaker
PostgreSQL Booth Staff
Gabriele Quaresima spoke at Cloud Native Bergen on Tuesday, October 28, 2025.
Community Blog Posts
Mayuresh B: Slonik on the Catwalk: PGConf.EU 2025 Recap
Floor Drees: My PGConf EU 2025 experience
Cornelia Biacsics: The PostgreSQL Village and Head of Marketing experiences @PGConf.EU 2025 – A Riga Recap
Jan Karremans: The Pulse of PostgreSQL: A Global Perspective from PGConf.EU 2025
Rafia Sabih: A Glimpse of PostgreSQL Conference Europe 2025
Valeria Kaplan: PostgreSQL — Blurring the Line Between Mine and Ours
I wrote a post in my MySQL blog on migrating from MySQL to PostgreSQL using DBeaver. You can pass it along to your acquaintances who want to get off the Dolphin and on the Elephant.
Not only will DBeaver move your tables and data, but you can compare them afterwards. In the post, I outline the process in five steps. DBeaver will let you do it in four.
Please share this blog with your friends to promote PostgreSQL.
High availability in PostgreSQL doesn't always require a globally distributed architecture. Sometimes you need reliable failover and replication within a single datacentre or region. pgEdge Enterprise Postgres handles this scenario with a production-ready PostgreSQL distribution that includes the tools you need for high availability out of the box.
From time to time I get to investigate issues with some sort of a batch process. It’s getting more and more common that such processes use very high memory limits (maintenance_work_mem and work_mem). I suppose some DBAs follow the logic that “more is better”, not realizing it can hurt performance quite a bit.
Let me demonstrate this using an example I ran across while testing a fix for parallel builds of GIN indexes. The bug is not particularly interesting or complex, but it required a fairly high value for maintenance_work_mem (the initial report used 20GB).
In #PostgresMarathon 2-009 and #PostgresMarathon 2-010, we explored why execution 6 causes a lock explosion when building a generic plan for partitioned tables — the planner must lock all 52 relations because it can't prune without parameter values.
Today we'll test what actually happens with different plan_cache_mode settings.
Let's test empirically on Postgres 18 with the 12-partition table from the previous posts. This time, we'll insert 1M rows into each partition to get realistic query plans with index scans:
-- Insert 1M rows into each partition
do $$
declare
i int;
start_date date;
partition_name text;
begin
for i in 0..11 loop
start_date := '2024-01-01'::date + make_interval(months => i);
partition_name := 'events_' || to_char(start_date, 'YYYY_MM');
execute format(
'insert into %I (event_id, event_time, event_data)
select s, %L::timestamptz + (s * interval ''1 second''), ''data_'' || s
from generate_series(1, 1000000) s',
partition_name, start_date
);
end loop;
end $$;
vacuum analyze events;
Now let's see what different plan_cache_mode settings do.
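For reference, these are the three values we will be switching between (a session-level sketch):
set plan_cache_mode = 'auto';               -- default: cost-based choice between plan types
set plan_cache_mode = 'force_custom_plan';  -- always replan with the actual parameter values
set plan_cache_mode = 'force_generic_plan'; -- always use the parameter-independent plan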
Test 1: auto mode
As we already saw, with the default auto mode setting, Postgres decides whether to use custom or generic plans based on a cost comparison. We saw that for the first 5 runs it uses a custom plan; then, on the important 6th call, it builds the generic plan (and we get the "lock explosion" we already studied); and from the 7th call onward, it uses the generic plan.
But that was on empty tables. Here things will be different, because we have data. Let's see.
As before, prepare a statement and execute it multiple times, observing the number of relation-level locks and plan type counters:
prepare test (timestamptz) as
select event_id, event_data from events where event_time = $1;
-- Run this snippet to test each execution
begin;
explain (verbose) execute test(timestamptz '2024-06-06 00:00:00+00');
select
count(*) as lock_count,
array_agg(
distinct relation::reg
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.