Latest Blog Posts

Two projects, one mission - hackorum and pginbox join forces
Posted by Kai Wagner in Percona on 2026-05-12 at 11:15

Last week, Zsolt and I jumped on a call with someone who had been building something remarkably similar to what we had been working on, completely independently. That someone is Jack Bonatakis, the creator of pginbox.dev, and that call turned into one of the most energizing conversations we’ve had since launching hackorum.dev.

Two builders, one problem

When we launched Hackorum back in January, the goal was simple but important: make the pg-hackers mailing list actually readable. The list is the heartbeat of PostgreSQL core development, patches are proposed, debated, iterated on, and committed entirely through it. But the interface? Decades-old email threads. Dense, fast-moving, and not exactly welcoming to newcomers or even experienced contributors trying to manage the volume.

All Your GUCs in a Row: autovacuum_worker_slots
Posted by Christophe Pettus in pgExperts on 2026-05-12 at 01:00
PostgreSQL 18 splits autovacuum configuration to finally let you tune worker concurrency without restarting.

What’s New in pg_clickhouse
Posted by David Wheeler on 2026-05-11 at 20:24

Bit of a news catchup on the pg_clickhouse project.

What’s New

First up, a couple weeks ago the ClickHouse Blog published What’s New in pg_clickhouse, in which I covered various improvements to the extension:

We’ve been gratified by the community reception of pg_clickhouse, the extension to query ClickHouse databases from Postgres. Recent uptake generated a ton of feedback, which we’ve been diligently addressing in the last few releases. These changes follow our constant mantra for pg_clickhouse: pushdown, pushdown, pushdown! Let’s take a quick tour.

It includes working pushdown examples for JSONB accessors, SQL value functions like CURRENT_TIMESTAMP, array functions like array_cat() and array_to_string(). It wraps with a demonstration of HTTP result set streaming, with a nice bar char for the before and after (spoiler: pg_clickhouse’s http driver became far more memory-efficient).

v0.3.0

But that’s not all. Today we released pg_clickhouse 0.3.0. Nothing drives improvements like customer issues, and v0.3.0 features a slew of them, including:

  • Mapping for the ClickHouse JSON type to the PostgreSQL JSONB type in the binary driver; it was already supported for the HTTP driver.

  • Support for mapping the Postgres JSON type to the ClickHouse JSON type. In general JSONB better matches ClickHouse JSON semantics, but we wanted to support the obvious alternative.

  • Pushdown for the Postgres to_char(timestamp[tz], fmt) function to the ClickHouse formatDateTime() function for formats that map to binary-compatible equivalents: YYYY, MM, DD, DDD, HH24, HH12, HH, MI, SS, Q, Mon, Dy, AM/PM, plus lowercase variants.

  • Support for pushing down functions from the new re2 extension, which provides ClickHouse-compatible RE2-backed regular expression functio

[...]

SSL in PostgreSQL
Posted by SHRIDHAR KHANAL in Stormatics on 2026-05-11 at 15:09

A beginner’s guide to encrypting your database connections

“’SSL is enabled’ and ‘SSL is actually working’ are two very different things.”

1. What is SSL, and why does a database need
it?

SSL stands for Secure Sockets Layer. Its successor is TLS (Transport Layer Security), but in the PostgreSQL world, and in most documentation, people still call it SSL out of old habit. Don’t let that confuse you. When someone says “SSL” in a Postgres context, they mean modern TLS-based encryption.

Here’s the problem it solves. By default, when your application connects to PostgreSQL, everything travels across the network in plain text. Usernames. Passwords. Every query you run. Every row of data that comes back. If anyone can intercept that traffic, someone on the same network, a compromised internal service, they can read all of it. A basic packet sniffer is enough. No special skills needed.

SSL wraps that connection in encryption before any data is exchanged. What travels on the wire becomes unreadable noise to anyone who doesn’t hold the session keys.

ℹ Note:  Even inside a private network or VPC, this matters. And the “it’s an internal network” line doesn’t protect you from lateral movement attacks, where an attacker is already inside the perimeter

2. How does SSL actually work?

When a client connects to PostgreSQL with SSL, before any database traffic is exchanged, this sequence happens: The client opens a plain TCP connection and signals it wants SSL. The server sends its certificate — a signed document that proves the server’s identity and contains its public key. The client checks whether that certificate was signed by a Certificate Authority it trusts, and whether the hostname in the certificate matches what it is connected to. If both checks pass, both sides negotiate a cipher and derive a shared session key. After that, all PostgreSQL traffic — authenticat

[...]

The wal_level You Set Is Not the wal_level You Get
Posted by Christophe Pettus in pgExperts on 2026-05-11 at 15:00
PostgreSQL 19 finally lets wal_level adapt dynamically to your actual replication slots, eliminating the always-on WAL cost of logical standby insurance.

Making JSONB More Queryable with Generated Columns
Posted by Richard Yen on 2026-05-11 at 06:00

Introduction

Over the past year, I’ve worked in a handful of contexts managing large volumes of data stored as JSONB in PostgreSQL. The scenario is common: users appreciate the flexibility of a document-oriented storage model, avoiding the need to predefine schemas or constantly migrate table structures as their data requirements evolve. JSONB documents can be deeply nested with numerous optional fields, and they scale to hundreds of kilobytes per record without issue. However, when the time comes to query these documents – filtering by user ID, event type, timestamps, or nested action properties – the queries can become slow and/or cumbersome to work with.

The problem I want to address is: “How do we make searching JSONB data more efficient without breaking apart our documents or forcing it into columns in a relational database?” There are several approaches available in Postgres, each with different tradeoffs. I hope to shed some light on those approaches in this article.

The Setup

I created a basic, no-frills table for the sake of this test:

CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    data JSONB NOT NULL
);

Here's the document shape I used for testing and writing this post -- it's representative of the event logs and audit trails I've encountered: a mix of primitive fields, nested objects, and metadata that accumulates over time.

-- Representative JSONB document
{
  "user_id": 5234,
  "event_type": "event_42",
  "timestamp": 1712341200,
  "session_id": "sess_abc123...",
  "ip_address": "192.168.1.42",
  "action": {
    "type": "click",
    "target_id": 87654,
    "coordinates": {"x": 512, "y": 768},
    "duration_ms": 1234
  },
  "device": {
    "type": "mobile",
    "os": "iOS",
    "screen_width": 1920,
    "screen_height": 1080
  },
  "performance": {
    "page_load_time": 1234,
    "dns_lookup": 123,
    "tcp_connection": 234,
    "server_response": 876
  },
  "custom_fields": { ... }
}

The queries that matter are straightforward equality and range filters on known

[...]

All Your GUCs in a Row: autovacuum_work_mem
Posted by Christophe Pettus in pgExperts on 2026-05-11 at 01:00
autovacuum_work_mem sets the maximum memory each autovacuum worker may use for tracking dead tuple identifiers (TIDs) during a vacuum. Default is -1, which means “inherit from maintenance_work_mem.” Context is sighup. The parameter exists so that autovacuum’s memory consumption can be tuned indep…

All Your GUCs in a Row: autovacuum_vacuum_scale_factor and autovacuum_vacuum_threshold
Posted by Christophe Pettus in pgExperts on 2026-05-10 at 01:00
Autovacuum's most powerful tuning lever: the scale factor that determines when dead tuples trigger a vacuum. On large tables, the 20% default waits too long.

Strong views on PostgreSQL VIEWs
Posted by Radim Marek on 2026-05-10 at 00:00

VIEWs should be the cleanest abstraction SQL, and therefore Postgres, has on offer. I love the concept. The promise of decoupling logical intent from physical storage is perfect on paper. In practice, few things in the database world trigger such a heated debate or carry as much historical baggage. VIEWs mix big promises with false hopes, and the promises rarely survive contact with production.

The appeal is straightforward. Abstract "active customer" once and reuse it everywhere. Every query, report and dashboard uses the same definition. The "active customer" then becomes the foundation of a "customer orders" view, which in turn powers an operational "customer summary" view.

-- layer 1: who counts as an active customer?
CREATE VIEW active_customers AS
SELECT c.*
FROM customers c
WHERE c.deleted_at IS NULL
  AND c.status = 'active'
  AND c.last_login_at > now() - interval '90 days';

-- layer 2: active customers with their recent orders
CREATE VIEW customer_orders AS
SELECT
    ac.*,
    o.id         AS order_id,
    o.total_cents,
    o.created_at AS ordered_at,
    o.status     AS order_status
FROM active_customers ac
LEFT JOIN orders o ON o.customer_id = ac.id
WHERE o.created_at > now() - interval '12 months'
   OR o.created_at IS NULL;

-- layer 3: one row per customer, ready for the dashboard
CREATE VIEW customer_summary AS
SELECT
    co.id,
    co.email,
    co.name,
    COUNT(co.order_id)                                   AS orders_12mo,
    COALESCE(SUM(co.total_cents), 0)                     AS revenue_12mo_cents,
    MAX(co.ordered_at)                                   AS last_order_at,
    COUNT(*) FILTER (WHERE co.order_status = 'refunded') AS refunds_12mo
FROM customer_orders co
GROUP BY co.id, co.email, co.name;

Each layer has one job. "Active customer" is defined exactly once - if marketing changes the ninety-day rule tomorrow, it is one line in one place, and the dashboard query collapses to SELECT * FROM customer_summary WHERE id = $1.

VIEWs also have the potential to be a real se

[...]

All Your GUCs in a Row: autovacuum_vacuum_max_threshold
Posted by Christophe Pettus in pgExperts on 2026-05-09 at 01:00
PostgreSQL 18 finally fixes the autovacuum formula that left billion-row tables waiting for 200M dead tuples.

A Field Guide to Alternative Storage Engines for PostgreSQL
Posted by Christophe Pettus in pgExperts on 2026-05-08 at 17:30
Six years after PostgreSQL shipped the table access method API, the alternative storage engine ecosystem is thriving—but messier than early predictions…

pg_lake vs Lakebase: Two Very Different Things Called “Postgres + Lakehouse”
Posted by Christophe Pettus in pgExperts on 2026-05-08 at 15:00
Snowflake's pg_lake and Databricks' Lakebase both wrap PostgreSQL for lakehouse workloads, but they're nearly opposite architectures.

No Compiler Required: Writing SQL-Only Postgres Extensions
Posted by Shaun Thomas in pgEdge on 2026-05-08 at 12:11

Recently at Postgres Conference 2026 in San Jose, I presented a talk called Let's Build a Postgres Extension! Since that entire presentation was primarily focused on writing a C extension while exploring the Postgres source code, I only mentioned pure SQL extensions as an aside. But what's more likely in the Postgres community in general: C devs, or people who know SQL?It turns out that you can do a lot with functions, triggers, views, tables, and various other Postgres-native capabilities. The extension system doesn't care whether the contents are compiled C or plain SQL. It just wants a control file, a SQL script, and an optional  to help with installation.So let's build a relatively trivial extension article entirely in SQL.

What Do We Want?

First things first: we need a plan. What should this extension actually do? I wrote about blocking DDL a while back with a C extension, so why not revisit that example with SQL?This being pure SQL, there are other handy elements we can add with very little effort, so how about:
  • A setting to enable or disable the extension.
  • A setting to allow or block superusers from executing DDL.
  • A role that allows members to bypass the DDL restriction.
  • A function to add users to the bypass role.
  • A function to remove users from the bypass role.
  • A view to see which users are in the bypass role.
  • An
  • event trigger
  •  to actually block DDL attempts.
Rather than a simple event trigger to prevent DDL execution, we are building a kind of DDL execution management suite. That should hopefully demonstrate just how capable a purely SQL implementation can be.

Three Files and a Dream

Every Postgres extension, regardless of complexity, boils down to the same basic structure:
  • A control file to describe the extension.
  • A SQL script to create the tables, views, functions, etc.
  • An optional Makefile to copy the SQL script and control file to the right place. Unlike a C project, there's no build step for a SQL-only extension bec
[...]

All Your GUCs in a Row: autovacuum_vacuum_insert_scale_factor and autovacuum_vacuum_insert_threshold
Posted by Christophe Pettus in pgExperts on 2026-05-08 at 01:00
PostgreSQL 13 added insert-triggered autovacuum to solve a critical problem: append-only tables never vacuumed, breaking index-only scans and delaying tuple…

Tracing PostgreSQL Using eBPF and Hardware Breakpoints
Posted by Jan Kristof Nidzwetzki on 2026-05-08 at 00:00

Hardware breakpoints can trigger eBPF programs when specific memory addresses are accessed, leveraging CPU hardware support for low overhead. By utilizing these hardware breakpoints, we can efficiently monitor PostgreSQL’s internal variable updates, such as transaction ID generation and OID assignment. In this post, we will discuss what hardware breakpoints are, whether they have less overhead than uprobes, and how to answer questions like “How many transactions are being executed per second?” or “Which backend is consuming the most OIDs?” with bpftrace.

In a previous blog post, I discussed how to use eBPF, uprobes/uretprobes, and bpftrace to monitor PostgreSQL’s internal functions, such as vacuum. uprobes and uretprobes trigger eBPF code in the Linux kernel when a function in user space is entered or exited. Even though uprobes and uretprobes have very low overhead, they still require instrumenting the function entry or exit with a software interrupt. That overhead is especially relevant for functions that are called very frequently. In contrast, hardware breakpoints use CPU hardware features to monitor specific memory addresses and trigger a real hardware interrupt when the monitored address is accessed. Therefore, they also let us catch all updates to a specific variable, even if it is updated in multiple functions, without instrumenting every function that touches it.

How Uprobes Work Under the Hood?

Uprobes and uretprobes instrument the function entry or exit by replacing the first few instructions with a software (int3) interrupt. When the function is called, the CPU executes the software interrupt, triggering a CPU mode switch that enables the eBPF program to run.

When the eBPF program finishes, the kernel needs to execute the instruction that was replaced with int3. This is called out-of-line execution and requires the kernel to run the original instruction separately, which adds additional overhead.

The instruction replacement can be observed in gdb by inspecting the first few byte

[...]

The Maintainer Is Not the Owner
Posted by Christophe Pettus in pgExperts on 2026-05-07 at 17:30
When a maintainer rewrites a project with AI and changes its license, they've crossed a line.

pgEdge Control Plane Adds Supporting Services and a Preview of systemd Support
Posted by Antony Pegg in pgEdge on 2026-05-07 at 17:20

Most Postgres management tools ask you to pick a lane. You can manage databases, or you can manage the services around them. You can run in containers, or you can run on bare metal. You get one deployment model, one operational surface, one set of assumptions about how your infrastructure works.The pgEdge Control Plane just added two features that refuse to pick a lane: Supporting Services and systemd Support. Together, they push the Control Plane into territory that, as far as we can tell, nobody else in the Postgres world is covering. Supporting Services is fully available, while the systemd support is currently a Preview feature.

Supporting Services: More Than Just Postgres

Here's the thing about enterprise Postgres in 2026: the database is only part of the story. Your AI agents need an MCP server to talk to the data, your applications need a REST API to query it, and your knowledge base needs a RAG server to index and retrieve from it. These services aren't optional extras, they're what make the database useful in production.Until now, you managed those services separately: different deployment pipelines, different configuration, different credentials, different monitoring. The database lived in one world and the services that depended on it lived in another, even though they're fundamentally coupled. When the database moves, the services need to follow, and when credentials rotate, every connected service needs to know about it. When you scale out, everything needs to come along for the ride.Supporting Services in the Control Plane fixes this by treating the database and its surrounding services as a single declarative unit. You add a array to the same JSON spec you already use for your database, and the Control Plane handles deployment, credential provisioning, health checking, and lifecycle management for everything together.That's a two-node distributed database with Spock multi-master replication, an MCP server for AI agent access on the US node, and PostgREST instances on both nodes for REST API [...]

Eight Bytes Is the Easy Part
Posted by Christophe Pettus in pgExperts on 2026-05-07 at 15:00
PostgreSQL 19 expands MultiXactOffset to 64 bits, eliminating a real outage failure mode. So when do regular transaction IDs get the same treatment?

You have a Patroni leader election. You are only halfway to PostgreSQL high availability.
Posted by Umair Shahid in Stormatics on 2026-05-07 at 10:40

A PostgreSQL primary loses power at 2am. Writes resume in under thirty seconds. The on-call engineer reads the alert in the morning, sees that the cluster healed itself, and goes back to coffee. That is the outcome PostgreSQL high availability is supposed to deliver.

A working Patroni cluster, on its own, gets you partway there. The leader election runs. A standby gets promoted. The cluster state in etcd stays consistent. Then the application keeps trying to reach an IP address that points at the wrong node now, the old primary needs a manual rejoin, and the on-call engineer is on a conference bridge instead of in bed.

I have seen this pattern enough times to call it the default. The cluster does its job. The application waits on a human. The runbook comes out. RTO passes the SLA. Everyone agrees afterward that “we should look at HA more seriously.”

The arithmetic of recovery time

The case for automation is mostly arithmetic.

When the cluster heals itself, the RTO clock starts at the failure detection and stops at the application’s first successful write. With Patroni’s TTL set to 30 seconds, a routing layer that follows promotion within another second or two, and an application that retries with backoff, the whole sequence finishes in under a minute. Often well under.

Bring a human into the loop, and a different clock starts. The monitoring system needs to detect the failure, group it into an alert, and deliver it to the on-call engineer’s pager. That alone is often 30 to 60 seconds. The engineer needs to wake up, find a laptop, log into the bastion, and load enough context to know what is happening. Even for a sharp on-call engineer ready at the keyboard, that is 5 to 10 minutes of best-case effort. Then comes the investigation: which node failed, what state the cluster is in, what is safe to do next. That is another 5 to 30 minutes depending on how clean the runbook is and how confident the engineer f

[...]

PG DATA 2026: The talks I am most excited about. Part 4 (the last one!)
Posted by Henrietta Dombrovskaya on 2026-05-07 at 02:08

That’s the last post of the series about the talks at the upcoming PG DATA 2026 conference, covering the remaining Friday talks.

Part 1

Part 2

Part 3

First, I wanted to mention two more talks presented by PG DATA organizers: Comparing Apples to Oranges with Postgres’ Type System by Dian Fay and Master Upgrading PostgreSQL, Using Real World stories and examples by Pat Wright. Dian’s talk is about Postgres types, and I can’t say enough how much I love the ability to create new types! Probably even more than I love Postgres extensions! Pat’s talk is about real-life upgrade stories, and although we know way too well that our own upgrades will present us with our unique challenges, it’s still worth learning from other people’s experience 🙂

Yet another real-life story is Apoorv Garg’s Electric SQL: Local -first Architecture. Building mobile applications is not something I am familiar with, but it looks like developers had to face familiar problems of reliability, performance, and security, which become especially challenging in the situation of a disappearing network.

Egor Tarasenko’s presentation, Streamlining Data Ingestion and Transformation with Trino + dbt, addresses a well-known problem: handling DDL changes in the source when they are not promptly communicated to the streaming process. I’ve seen multiple solutions to this problem, but none of them appeared to be perfect, so I’m very interested to hear Egor’s perspective.

Several presentations will address understanding and monitoring query execution. First is Alfredo Rodriguez’s presentation How to understand EXPLAIN without dying in the attempt. I remember Alfredo presenting at PG Day Chicago for the first time, and I know he is happy to be back with his by now well-known presentation. Then comes Mohsin Ejaz’s “Why your PostgreSQL tuning guide might be wrong (and what to do about it),” in which he shares DBTune’s perspective. And finally, Postgres plan monitoring and management in practice by Lukas Fittl. I am a great admirer of Lu

[...]

All Your GUCs in a Row: autovacuum_naptime, autovacuum_vacuum_cost_delay, autovacuum_vacuum_cost_limit
Posted by Christophe Pettus in pgExperts on 2026-05-07 at 01:00
Three autovacuum parameters control how often PostgreSQL vacuums, how hard it works, and how long it pauses.

Christophe’s Seven Rules of Disaster Reponse
Posted by Christophe Pettus in pgExperts on 2026-05-06 at 23:00
When your database catches fire, panic is optional. Learn seven battle-tested rules that turn chaos into a coordinated response.

MultiXact Members at 64 Bits: One Less Wraparound to Worry About
Posted by Christophe Pettus in pgExperts on 2026-05-06 at 19:00
PostgreSQL 19 eliminates the 32-bit MultiXactOffset ceiling that has crashed high-concurrency FK-heavy clusters at 3 a.m.

What a Data Lake Actually Is (and why you probably don’t need one)
Posted by Christophe Pettus in pgExperts on 2026-05-06 at 15:00
Most organizations that build data lakes don't need them.

I Built Three GitHub Codespaces Walkthroughs for Our Products. Would You Use Them?
Posted by Antony Pegg in pgEdge on 2026-05-06 at 09:18

I need your feedback to either convince Marketing that I’m a genius and they should put these GitHub Codespaces Walkthroughs on our website, or to tell me I need to keep looking for different ways to make Quickstarts easier.Bi-directional logical replication is not a simple thing. It's a genuinely complicated problem, and getting it right across multiple nodes in a distributed PostgreSQL cluster is hard. That's what makes what we do at pgEdge special: we've done the hard engineering so you don't have to. Multi-master replication, conflict resolution, failover, all of it wrapped up so you can have this capability without needing to be a rocket scientist.But there's still a gap between "this product exists" and "I've actually tried it," and that gap is almost always the setup. You need multiple Postgres instances, a replication extension configured between them, and enough infrastructure to actually prove it's working. By the time you've got all of that running on your laptop, you've burned an afternoon and you haven't learned anything about distributed Postgres yet. You've learned about Docker networking.I wanted to see if I could close that gap. I’ve created three GitHub Codespaces walkthroughs, each targeting a different pgEdge product, each designed to take you from zero to a running environment without installing a single thing on your machine. The issue is that I have no idea whether developers would actually find them useful until I get some data - and that is where you, dear Reader, can help me out, just by trying them and letting me know.

Why Codespaces

GitHub Codespaces gives you a full Linux development environment in a browser tab, backed by a container running on GitHub's infrastructure. The free tier gives individual developers 120 core-hours per month, (so 60 for a 2-core, 30 for a 4-core machine) which is more than enough to run through all three of these walkthroughs multiple times. For us, the appeal was simple: if you can click a link, you can be inside a working environment in about 60 se[...]

Nordic Cool Meets Parisian Chic Vlog: Two PGDays, One Week
Posted by Pavlo Golub in Cybertec on 2026-05-06 at 07:28

Two conferences. Two cities. Two completely different personalities. And me, somewhere in the middle, trying to keep up. 😄

First stop Helsinki, March 24. Nordic PGDay 2026. The Finns are punctual, focused, and deadly serious about PostgreSQL. Talks start on time. Coffee is strong. Silence is not awkward, it is just how things are. I loved every minute of it.

Two days later Paris, March 26. pgDay Paris, the 10th edition! Same elephant, completely different atmosphere. People arrive fashionably late, conversations go long, and somehow everything still works out beautifully. C'est la vie. 

Same community, same passion for open source, but such different energy. If you ever wondered whether PostgreSQL people have a cultural identity — yes, they do. And it depends heavily on latitude. I grabbed my camera to capture both! 😅

 

The post Nordic Cool Meets Parisian Chic Vlog: Two PGDays, One Week appeared first on CYBERTEC PostgreSQL | Services & Support.

All Your GUCs in a Row: autovacuum_multixact_freeze_max_age
Posted by Christophe Pettus in pgExperts on 2026-05-06 at 01:00
Prevent MultiXact ID wraparound by controlling when autovacuum freezes old locks.

Managed Postgres, Examined: Amazon Aurora PostgreSQL
Posted by Christophe Pettus in pgExperts on 2026-05-05 at 13:00
Aurora PostgreSQL separates compute and storage, replicating redo records across three Availability Zones.

How are committers selected?
Posted by Tomas Vondra on 2026-05-05 at 10:00

At a couple recent conferences, I got to describe the process Postgres uses to select new committers/maintainers. Usually to users and developers using Postgres, but in some cases it was unclear even to experienced Postgres contributors.

The official docs are rather brief, and don’t explain various important details. Let me explain how I understand the informal process, who’s responsible for what etc.

This post is not meant to give you advice on how to become a committer, that’s a far more subjective question. Perhaps in some future post, not sure yet.

CYBERTEC's contributions to PostgreSQL 19
Posted by Christoph Berg in Cybertec on 2026-05-05 at 09:40

The window for new features in PostgreSQL 19 has closed with the Commitfest PG19-Final on April 9th. 182 patches were committed in this commitfest alone (plus more in the preceding ones). No new features are being accepted for PostgreSQL 20 yet, the git branches for 19 and 20 will likely be branched off in June. Currently the focus of the PostgreSQL community is on stabilizing PostgreSQL 19 so it is ready for release at the end of summer. If everything goes well, it will be released in September 2026.

Time to have a look at what the CYBERTEC people have been doing during the PostgreSQL 19 cycle since the PG 18 branch was split off in June 2025.

The big change: REPACK CONCURRENTLY by Antonin Houska

One of the most popular CYBERTEC open-source PostgreSQL project is pg_squeeze, written by Antonin Houska. Like PostgreSQL's built-in VACUUM FULL feature and other projects like pg_repack, it lets users compact bloated PostgreSQL tables by rewriting them into fresh tables, reclaiming any wasted storage space caused by DELETE and UPDATE operations. The downside of VACUUM FULL is that it requires an access-exclusive lock on tables, so it cannot be used while the database is being accessed by users. In contrast, pg_squeeze and pg_repack perform the operation online. The even support write operations while the table is copied over, duplicating writes to the copy. pg_repack does that the traditional way by creating a set of triggers on the table. CYBERTEC's pg_squeeze uses modern PostgreSQL mechanisms, setting up logical replication between the old and the new table for the duration of the operation. Both methods still need an access-exclusive lock at the end of the operation to swap the new table into the place of the old one, but that is a quick constant-time operation.

Some time ago, PostgreSQL committer Álvaro Herrera approached Antonin asking if CYBERTEC would be willing to donate pg_squeeze for inclusion into PostgreSQL itself. We were of course were happy to support the idea. Antonin and Álvaro put i

[...]

Top posters

Number of posts in the past two months

Top teams

Number of posts in the past two months

Feeds

Planet

  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.

Contact

Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.