Latest Blog Posts

CloudNativePG Recipe 1 - Setting up your local playground in minutes
Posted by Gabriele Bartolini in EDB on 2024-03-03 at 00:00

Dive into the world of running PostgreSQL in Kubernetes with CloudNativePG in this inaugural guide. Follow along as we walk you through the process of setting up a disposable local cluster using kind. Gain insights into creating PostgreSQL clusters, installing CloudNativePG, and leveraging the cnpg plugin for kubectl. Wrap up your journey by tidying up your local cluster. Whether you’re a developer or a DBA, this foundational guide provides a launchpad for your future CloudNativePG explorations with a fully open source stack.

What’s in a name? Hello POSETTE: An Event for Postgres 2024
Posted by Claire Giordano in CitusData on 2024-03-01 at 20:58

When I think about naming something—like a feature or product or even an event—this quote always comes to mind.

What’s in a name? That which we call a rose
By any other name would smell as sweet;

–William Shakespeare

What’s in a name, after all? I’m no expert on Romeo and Juliet, but friends tell me Shakespeare’s point was that names don’t matter. The thing itself is the thing itself, regardless of the name.

My parents named my sister “Helen” at birth but never actually called her that. They always called her by a nickname, “Lyena”. So my sister’s sense of self became intertwined with her nickname: she “felt” like a Lyena. And the only people that ever called her Helen were officious school principals, gate-check agents looking at her passport—and our paternal grandfather. It made her so mad. Whenever my grandfather insisted on calling her Helen, you could almost see the steam coming out of my sister’s ears.

My husband told me about a thing I’ve unconsciously done for years: whenever we drive through Suisun City en route to the mountains, I say the name of the city out loud to myself. Not just once but several times, like I’m chewing on the word. Turns out I really like the way it feels when I say “Suh-soon-si-tee” out loud.

Names carry meaning. They trigger emotions. The phonetic sound of a word affects whether you can remember it. And some words just “roll off the tongue” in a way that makes it easy to say and easy to remember. Bottom line, names matter.

Which is why we decided to give “Citus Con: An Event for Postgres” a new name. People had told us that when they heard the event’s nickname of “Citus Con” they thought it was only about Citus—and did not realize that over 66% of last year’s Citus Con talks were about Postgres, and not about Citus.

Say hello to POSETTE: An Event for Postgres, now in its 3rd year. A free and virtual developer event brought to you with 🧡 by the Postgres team here at Microsoft.

What does the “POSETTE” in “POSETTE: An Event f

[...]

PGSQL Phriday #016: Tuning That One Big Query
Posted by Ryan Booz on 2024-03-01 at 17:32
PostgreSQL continues to dominate in growth and interest by many people and teams across the globe. In my role at Redgate I’m often brought into conversations with long-time SQL Server developers and DBA’s that are now being tasked with learning Postgres. I was one of those people myself a few years ago. Aside from the ... Read more

Partitioning by reference – Oracle vs PostgreSQL
Posted by Gilles Darold in HexaCluster on 2024-02-29 at 16:09

Sometimes, when you are working on a migration to PostgreSQL, you can encounter features that do not exist in PostgreSQL. This especially happens when we do not have any extension to emulate the feature you are looking at. At HexaCluster Corp, we often face these kind of situations. When it is possible to create an […]

The post Partitioning by reference – Oracle vs PostgreSQL appeared first on HexaCluster.

List of PostgreSQL® AI Projects and Resources
Posted by Francesco Tisiot in Aiven on 2024-02-29 at 10:00

Everyone is now talking about AI, and modern databases like PostgreSQL® are increasingly being adopted in companies' AI journey as sources of data or key pieces of the AI infrastructure. Moreover there's a new set projects that are solving PostgreSQL problems with AI.

PRs are welcome!

The List of PostgreSQL® AI projects and resources is open-source project to collect PostgreSQL® extensions, applications and resources (video or blogs) talking about how our loved database can fit in the AI journey.

Preview of List of PostgreSQL® AI Projects and Resources

PostgreSQL security – Password Reuse Policy
Posted by Gilles Darold in HexaCluster on 2024-02-29 at 06:02

A database password reuse policy is a set of rules that govern the use of passwords within a database system. The policy is designed to ensure that users create strong and unique passwords, and that they do not reuse the same password. By implementing a database password reuse policy, organizations can reduce the risk of […]

The post PostgreSQL security – Password Reuse Policy appeared first on HexaCluster.

Can you use ltree for Nested Place Data?
Posted by Ryan Lambert on 2024-02-29 at 05:01

The topic of the ltree data type has come up a few times recently. This intersects with a common type of query used in PostGIS: nested geometries. An example of nested geometries is the state of Colorado exists within the United States. The PgOSM Flex project calculates and stores nested polygon data from OpenStreetMap places into a handful of array (TEXT[], BIGINT[]) columns. I decided to explore ltree to see if it would be a suitable option for PgOSM Flex nested places.

Spoiler alert: ltree is not suitable for OpenStreetMap data in the way I would want to use it.

Nested data in arrays

The following is what the "Colorado is in the U.S" would look like using a Postgres TEXT[] array:

{"United States","Colorado"}

Podcast about transitioning from developer to PostgreSQL specialist, with Derk van Veen
Posted by Ari Padilla in CitusData on 2024-02-28 at 18:07

How do you feel when your day doesn’t go as planned? In this episode of the Path To Citus Con, the podcast for developers who love Postgres, guest Derk van Veen joins co-hosts Claire Giordano and Pino de Candia to talk about his journey from Java developer to Postgres specialist.

What makes you feel alive at work? Is it the routine tasks, the predictable outcomes, the stable environment? Or is it the unexpected challenges, the unknown variables, the chaotic situations? If you are like Derk (and I), you thrive on the latter. Maybe you love to jump on tough problems and find beautiful solutions—or maybe you enjoy the thrill of finding the root cause of a slow system or some faulty code. You don't just follow a recipe. You ask questions, explore options, and experiment with different strategies. How do you partition a table? Why do you partition a table? What are the trade-offs of each approach?

In this post, you’ll find some of our favorite episode highlights and quotes. You’ll find links to where you can subscribe and listen to past episodes of the podcast at the end of the post.

Path to Citus Con Ep. 12 YouTube thumbnail
Figure 1: YouTube thumbnail for episode 12 of the Path To Citus Con, the podcast for developers who love Postgres, with (starting in the top left, listed clockwise) Derk van Veen, Claire Giordano, and Pino de Candia. The topic is “From developer to Postgres specialist.”

Highlights from the podcast episode with Derk van Veen

“It always is about the why. Why do you share this? Why do you have to tell this to this audience? Why do you have this slide? Why is this visual on your slide? It's always about the why. Every part of it.” – Derk van Veen

When talking about conferences and public speaking, Derk shares how to sell a story, meaning that every part of a presentation and what you say needs to be intentional. You need to understand why it is there and make sure it is there for the right reasons. The idea is to guide the audience along a straight line, a smooth path, without unnecessary distr

[...]

First Row of Many Similar Ones
Posted by Tobias Petry on 2024-02-28 at 14:28
SQL is a straightforward and expressive language, but it sometimes lacks constructs for writing queries in a simple way. It's more complicated than it should be to write a query to get, e.g., only the most expensive order for every customer of the current fiscal year. You can use PostgreSQL's vendor-specific DISTINCT ON feature or window functions for every other database - like MySQL.

11 Lessons to learn when using NULLs in PostgreSQL®
Posted by Francesco Tisiot in Aiven on 2024-02-28 at 10:00

A boolean value should only contain two values, True or False, but is it correct? Usually people assume so, but sometimes miss the fact that there could be the absence of the value all-together. In databases this is absence is usually stored as NULL and this blog showcases how to find them, use them properly and 11 lessons to learn to be a NULL Pro!

Keep in mind, it's not only booleans that can contain NULL values, it's all the columns where you don't define a NOT NULL constraint!

If you need a FREE PostgreSQL database?
🦀 Check Aiven's FREE plans! 🦀
If you need to optimize your SQL query?
🐧 Check EverSQL! 🐧

It all starts with some columns and rows

Let's start from the basics: you have a PostgreSQL® database and a table, called users like:

CREATE TABLE users (
    id SERIAL,
    username TEXT PRIMARY KEY,
    name TEXT,
    surname TEXT,
    age INT
);

Hey, do you want to test the above code but not having a PostgreSQL database handy? Past your code in PostgreSQL Playground and quickly check the results!

Let's insert some data:

INSERT INTO users (username, name, surname, age) VALUES
    ('jdoe', 'Jon', 'Doe', 25),
    ('lspencer','Liz', 'Spencer', 35),
    ('hlondon','Hanna', 'London', 45);

Querying the data showcases the table with all the columns filled.

 id | username | name  | surname | age 
----+----------+-------+---------+-----
  1 | jdoe     | Jon   | Doe     |  25
  2 | lspencer | Liz   | Spencer |  35
  3 | hlondon  | Hanna | London  |  45
(3 rows)

Insert NULLs

Now, let's check if we can insert some NULLs, let's try by inserting them in the name, surname and age columns:

INSERT INTO users (username, name, surname, age) VALUES ('test',NULL, NULL, NULL);

This works since we don't have any constraint

 id | username | name  | surname | age 
----+----------+-------+---------+-----
  1 | jdoe     | Jon   | Doe     |  25
  2 | lspencer | Liz   | Spencer |  35
  3 | hlondon  | Hanna | London  |  45
  4 | test     |    
[...]

Maximizing Microservice Databases with Kubernetes, Postgres, and CloudNativePG
Posted by Gabriele Bartolini in EDB on 2024-02-28 at 00:00

Explore the synergy between PostgreSQL and Kubernetes through CloudNativePG — a transformative operator discussed in this article. Discover how this powerful open source stack empowers organizations to free themselves from vendor lock-in and to create a seamless microservice database environment, enhancing innovation, operational efficiency and velocity. This article provides a refreshed perspective on “ Why Run Postgres in Kubernetes?” from 2022.

Extension Ecosystem Summit 2024
Posted by David Wheeler in Tembo on 2024-02-27 at 17:46
Logo for PGConf.dev

I’m pleased to announce that some pals and I have organized and will host the (first annual?) Extension Ecosystem Summit at PGConf.dev in Vancouver (and more, see below) on May 28:

Enabling comprehensive indexing, discovery, and binary distribution.

Participants will collaborate to examine the ongoing work on PostgreSQL extension distribution, examine its challenges, identify questions, propose solutions, and agree on directions for execution.

Going to PGConf? Select it as an “Additional Option” when you register, or update your registration if you’ve already registered. Hope to see you there!


Photo of the summit of Mount Hood

Extension Ecosystem Mini-Summit

But if you can’t make it, that’s okay, because in the lead up to the Summit, to we’re hosting a series of six virtual gatherings, the Postgres Extension Ecosystem Mini-Summit.

Join us for an hour or so every other Wednesday starting March 6 to hear contributors to a variety of community and commercial extension initiatives outline the problems they want to solve, their attempts to so, challenges discovered along the way, and dreams for an ideal extension ecosystem in the future. Tentative speaker lineup:

  • March 6: David Wheeler, PGXN: “History and Context of Extension Distribution”
  • March 20: Ian Stanton, Tembo: “Trunk”
  • April 3: Devrim Gündüz: “Overview of the yum.postgresql.org architecture, how new RPMs are added, and issues and challenges with distributing RPMed extensions”
  • April 17: TBD
  • May 1: Yurii Rashkovskii, Omnigres: “Universally buildable extensions: dev to prod”
  • May 15: David Wheeler, PGXN: “Metadata for All: Enabling discovery, packaging, and community”

Hit the event page for details. Many thanks to my co-organizers Jeremy Schneider, David Christensen, Keith Fiske, and Devrim Gündüz, as well as the PGConf.dev organize

[...]

AUTOCOMMIT – Oracle vs PostgreSQL
Posted by Akhil Reddy Banappagari in HexaCluster on 2024-02-27 at 16:30

Oracle and PostgreSQL differ significantly in their transaction models, and AUTOCOMMIT is one of those differences. We see many of our customers successfully migrate code, but still encounter variations in behavior and even runtime errors related to transaction control. Many issues happen because AUTOCOMMIT settings differ between Oracle and PostgreSQL. It is definitely necessary to […]

The post AUTOCOMMIT – Oracle vs PostgreSQL appeared first on HexaCluster.

Uncovering and Exploring ROW Constructors in PostgreSQL.
Posted by Deepak Mahto on 2024-02-27 at 13:55

A ROW expression allows you to construct ROW values, which can represent anonymous records, specific table row types, or custom composite types. Its uses include processing records within multiple expressions (using operators like =, <>, <, <=, >, or >=), evaluation with subqueries, and combining ROW values to build composite types. You can even export these to JSON format (using functions like rowtojson). Some key operations you can perform with a ROW constructor in Conversion or migration are :

  • Building custom ROW types
  • Expanding ROW types for internal processing
  • Using with functions that return multiple out params

With a ROW constructor, we can build record types with different data types and later expand them using internal aliases, as shown below.

While using ROW constructors in recent migrations, I’ve encountered multiple usage patterns and scenarios. Let’s explore some of them in a blog post about uncovering the uses of ROW constructors in PostgreSQL.

We will include the following table as a sample in the blog.

postgres=# create table testrow (col1 integer, col2 integer generated by default as identity);
CREATE TABLE
postgres=# \d testrow
                           Table "public.testrow"
 Column |  Type   | Collation | Nullable |             Default              
--------+---------+-----------+----------+----------------------------------
 col1   | integer |           |          | 
 col2   | integer |           | not null | generated by default as identity



1. Auto aliasing on anonymous row type expansion.

Whenever we build a ROW type using a constructor, PostgreSQL maps it to a pseudo-RECORD type. If we later expand this ROW type, the internal field names will be assigned aliases like f1, f2, f3…fn. This is because ROW constructors don’t preserve the original table column names or any custom aliases.

When expanding a ROW constructor, regardless of whether it’s initialized using table columns or custom types, you’ll encounter the

[...]

Installing PostgreSQL on SLES 15 just got easier – and better!
Posted by Devrim GÜNDÜZ in EDB on 2024-02-27 at 13:30

The PostgreSQL RPM repository for SuSE Enterprise Linux
has supported SLES for quite some time. We followed the usual conventions and used zypper addrepo … to add the repositories.

Continue reading "Installing PostgreSQL on SLES 15 just got easier – and better!"

The default value of fdw_tuple_cost was updated to 0.2. What does that mean?
Posted by Umair Shahid in Stormatics on 2024-02-27 at 13:14

This blog post explores the recent change to the fdw_tuple_cost parameter in PostgreSQL, examining the problem it addresses and the reasoning behind the new default value (0.2).

The post The default value of fdw_tuple_cost was updated to 0.2. What does that mean? appeared first on Stormatics.

Understanding Indexes in pgvector
Posted by semab tariq in Stormatics on 2024-02-26 at 13:26

Explore how pgvector's indexes work, choose the right one for your needs, and find the best option for your critical data.

The post Understanding Indexes in pgvector appeared first on Stormatics.

TOAST and its influences on parallelism in PostgreSQL
Posted by Anthony Sotolongo León in OnGres on 2024-02-26 at 10:00

Introduction

Since the PostgreSQL 9.6 version was released, the feature related to query parallelism has appeared and has become a good option to improve query performance. Since then, the evolution of parallelism has been growing, resulting in better performance in the database. To manage the behavior of parallelism, there are some parameters that you can tune, for example:

  • max_parallel_workers_per_gather: The number of parallel workers to execute a query activity in parallel (default 2), these parallel workers are taken from the pool of processes defined by max_worker_processes, limited by max_parallel_workers, worth mentioning there is one special worker named: the leader worker, who coordinates and gathers the result of the scan from each of the parallel workers. This leader worker can or can not participate in scanning and will depend on its load in the coordination activities. Also, it is possible to control the involvement of the leader worker in the scanning using the parameter parallel_leader_participation.
  • min_parallel_table_scan_size: The minimum amount of table data(size) for a parallel scan to be considered (default 8MB)
  • min_parallel_index_scan_size: the minimum amount of index data(size) for a parallel scan to be considered (default 512kB)
  • parallel_setup_cost : the cost of starting up worker processes for parallel query (default 1000)
  • parallel_tuple_cost: the cost of passing each tuple (row) from worker to leader backend (default 0.1)
  • parallel_workers: A storage parameter for tables, that allows change the behavior of number of workers to execute a query activity in parallel, similar to max_parallel_workers_per_gather, but only for a specific table; ALTER TABLE tabname SET (parallel_workers = N);

According to the resources of your server and the workload of your database, you can tune these parameters to take advantage of parallelism. The min_parallel_table_scan_size affects the decision of Postgres code about how many workers

[...]

100x Faster Query in Aurora Postgres with a lower random_page_cost
Posted by Shayon Mukherjee on 2024-02-24 at 06:43
Recently I have been working with some queries in Postgres where I noticed either it has decided not to use an index and perform a sequential scan, or it decided to use an alternative index over a composite partial index. This was quite puzzling, especially when you know there are indexes in the system that can perform these queries faster. So what gives? After some research, I stumbled upon random_page_cost (ref).

pgagroal 1.6.0 has been released
Posted by Luca Ferrari on 2024-02-24 at 00:00

pgagroal, the fast connection pooler for PostgreSQL, has reached a new stable release!

pgagroal 1.6.0 has been released

A couple of days ago, pgagroal version 1.6.0 has been released.

This new version includes a lot of new features and small improvements that make pgagroal much more user-friendly and ease to adopt as a conenction pooler. The main contribution, from yours truly, has been command line refactoring and JSON support. Now the command line supports commands and subcommands, like for example conf get and conf set, and a more consistent set of commands. The JSON command output allows for an ease automation and a stable command output, so to ease the adoption in different scenarios.

But there’s more: a lot of other tickets have been solved during this release, and there is now support fo Mac OSX. Moreover, it is now possible to retrieve and set configuration values at run-time, thus without the need to manually editing the configuration file and reloading the daemon.

There is an initial exeperimental support for client certificates, and now it is possible to determine how long a connection must live.

A better handling of the configuration files, hence a better detection and reporting of misconfiguration, as well as a better error messaging system, completes the release.

The list of contributors is also expanding, and this is good and exciting!

Give pgagroal a try, you will be amazed by the capabilities of this connection pooler!

Multiple Client Certificate Selection – a Simple POC
Posted by cary huang in Highgo Software on 2024-02-23 at 23:13

Introduction

I recently came across this email thread discussion from several years ago, which discussed ways to enable a client to choose from a list of client certificates to send to the server. The benefit is obvious; when a client has to communicate with different PostgreSQL servers with different TLS settings and trust structure, the feature can help reduce the application’s effort to figure out what certificate to send to what server. In other words, the application does not need to have this selection logic because the libpq library is able to choose the right one to send.

OpenSSL provides PostgreSQL with TLS and other cryptographic capabilities and it has also evolved significantly over the past years. Now, it has much more API support that enabled applications to make more informed TLS decisions. For example, choosing the most appropriate client certificate to send to the server during handshake. Today, in this post, I will briefly explain how certificate chain of trust works and share the “multiple client certificate selection” patch if one is intereted.

TLS ? Certificate ? Trust Chain? What?

TLS (Transport Layer Security)

formerly called SSL (Secured Socket Layer). Now the two terms are used interchangeably. This is a protocol for encrypting and protecting network communications. It has 2 major responsibilities:

  • Communication encryption / decryption + integrity check
  • Use certificates to verify the identities of both parties.

Certificate

Also called X509 v3 Certificate, or simply X509 Certificate. It is a data structure that contains validity, issuer, extension, purpose and other custom information. Primarily used to represent an entity’s identity and ensure trust. It has three basic types:

  • Certificate Authority – Also called CA Certificate or root CA. (IdenTrust, DigiCert, GlobalSign, Let’s Encrypt, etc are common CAs. It can also be self-signed for test purposes).
  • Intermediate CA certificate.
  • Entity Certificate – The actual certificate
[...]

Changes to PGSQL Phriday Blogging Events
Posted by Ryan Booz on 2024-02-23 at 21:54
It’s been great to see the interest and participation of the monthly PGSQL Phriday blogging event grow over the last 15 months. And like any new venture, it takes time to see and understand the best way to organize it and what the best parameters are to encourage more participation from month to month. With ... Read more

The Rest is History: Investigations of WAL History Files
Posted by Brian Pace in Crunchy Data on 2024-02-23 at 13:00

PostgreSQL uses the concept of a timeline to identify a series of WAL records in space and time. Each timeline is identified by a number, a decimal in some places, hexadecimal in others. Each time a database is recovered using point in time recovery and sometimes during standby/replica promotion, a new timeline is generated.

A common mistake is to assume that a higher timeline number is synonymous with the most recent data. While the highest timeline points to the latest incarnation of the database, it doesn't guarantee that the database indeed holds the most useful data from an application standpoint. To discern the validity of this statement, a closer examination of the Write-Ahead Logging (WAL) history files is essential, unraveling the messages they convey.

In this discussion, we will explore a recovered database and trace the narrative embedded in the history files. By the conclusion, you will have gained a deeper insight into the functionality of these history files within Postgres, empowering you to address queries related to recovery processes and the database's historical journey (or may I call it the 'family tree').

Assessing current state

Let's begin by gaining insights into the current status of the database. The information obtained from the pg_controldata output indicates that the database is currently on timeline 11. Take note of the latest checkpoint Write-Ahead Logging (WAL) file, identified as '0000000B0000000100000039'. See my previous post on WAL file naming and numbering.

It's worth noting that timelines are sometimes expressed in decimal form, as in the case of 11, and at other times in hexadecimal form, such as '0000000B'. While this dual representation might be perplexing initially, familiarity with when and where these different forms are employed will contribute to a clearer understanding.

$ pg_controldata
...
pg_control last modified:             Tue 06 Feb 2024 03:10:53 PM EST
Latest checkpoint location:           1/39000060
Latest checkpoint's REDO location:   
[...]

Autoscaling in Action: Postgres Load Testing with pgbench
Posted by Raouf Chebri in Neon on 2024-02-23 at 09:25

Blog post cover

In this article, I’ll show Neon autoscaling in action by running a load test using one of Postgres’ most popular benchmarking tool, pgbench. The test simulates 30 clients running a heavy query.

While 30 doesn’t sound like a lot, the query involves a mathematical function with high computational overhead, which signals to the autoscaler-agent that it needs to allocate more resources to the VM.

We will not cover how autoscaling works, but for those interested in knowing the details, you can read more about how we implemented autoscaling in Neon.

For this load test, you will need:

  1. A Neon account
  2. pgbench

The load test

Ensuring your production database can perform under varying loads is crucial. That’s why we implemented autoscaling to Neon, a feature that dynamically adjusts resources allocated to a database in real-time, based on its current workload.

However, the effectiveness and efficiency of autoscaling are often taken for granted without thorough testing. To showcase autoscaling in action, we turn to Postgres and pgbench.

pgbench is a benchmarking tool included with Postgres, designed to evaluate the performance of a Postgres server. The tool simulates client load on the server and runs tests to measure how the server handles concurrent data requests.

pgbench is executed from the command line, and its usage can vary widely depending on the specific tests or benchmarks being run. Here is the command we will use in our test:

pgbench -f test.sql -c 30 -T 120 -P 1 <;CONNECTION_STRING>;

In this example, pgbench executes the query in test.sql. The parameter -c 30 specifies 30 client connections, and -T 120 runs the test for 120 seconds against your database. -P 1 specifies that pgbench should report the progress of the test every 1 second. The progress report typically includes the number of transactions completed so far and the number of transactions per second.

30 clients don’t seem like enough do stress a database. Well, it depends on t

[...]

Recent PGXN Improvements
Posted by David E. Wheeler on 2024-02-22 at 21:19

One of the perks of my new gig at Tembo is that I have more time to work on PGXN. In the last ten years I’ve had very little time to give, so things have stagnated. The API, for example, hasn’t seen a meaningful update since 2016!

But that’s all changed now, and every bit of the PGXN architecture has experienced a fair bit of TLC in the last few weeks. A quick review.

PGXN Manager

PGXN Manager provides user registration and extension release services. It maintains the root registry of the project, ensuring the consistency of download formatting and naming ($extension-$version.zip if you’re wondering). Last year I added a LISTEN/NOTIFY queue for publishing new releases to Twitter and Mastodon, replacing the old Twitter posting on upload. It has worked quite well (although Twitter killed the API so we no longer post there), but has sometimes disappeared for days or weeks at a time. I’ve futzed with it over the past year, but last weekend I think I finally got to the bottom of the problem.

As a result, the PGXN Mastodon account should say much more up-to-date than it sometimes has. I’ve also added additional logging capability, because sometimes the posts fail and I need better clarity into what failed and how so it, too, can be fixed. So messaging should continue to become more reliable. Oh, and testing and unstable releases are now prominently marked as such in the posts (or will be, next time someone uploads a non-stable release).

I also did a little work on the formatting of the How To doc, upgrading to MultiMarkdown 6 and removing some post-processing that appended unwanted backslashes to the ends of code lines.

PGXN API

PGXN API (docs) has been the least loved part of the architecture, but all that changed in the last couple weeks. I finally got over my trepidation and got it working where I could hack on it again. Turns out, the pending issues and to-dos — some dating back to 2011! — weren’t so difficult to take on as I had feared. In the last few weeks, API has finally come uns

[...]

The History and Future of Extension Versioning
Posted by David Wheeler in Tembo on 2024-02-22 at 19:33

Every software distribution system deals with versioning. Early in the design of PGXN, I decided to require semantic versions (SemVer), a clearly-defined and widely-adopted version standard, even in its pre-1.0 specification. I implemented the semver data type that would properly sort semantic versions, later ported to C by Sam Vilain and eventually updated to semver 2.0.0.

As I’ve been thinking through the jobs and tools for the Postgres extension ecosystem, I wanted to revisit this decision, the context in which it was made, and survey the field for other options. Maybe a “PGXN v2” should do something different?

But first that context, starting with Postgres itself.

PostgreSQL Extension Version Standard

From the introduction extensions in PostgreSQL 9.1, the project side-stepped the need for version standardization and enforcement by requiring extension authors to adopt a file naming convention, instead. For example, an extension named “pair” must have a file with its name, two dashes, then the version as listed in its control file, like so:

pair--1.1.sql

As long as the file name is correct and the version part byte-compatible with the control file entry, CREATE EXTENSION will find it. To upgrade an extension the author must provide a second file with the extension name, the old version, and the new version, all delimited by double dashes. For example, to upgrade our “pair” extension to version 1.2, the author supply all the SQL commands necessary to upgrade it in this file:

pair--1.1--1.2.sql

This pattern avoids the whole question of version standards, ordering for upgrades or downgrades, and all the rest: extension authors have full responsibility to name their files correctly.

PGXN Versions

SemVer simplified a number of issues for PGXN in ways that the PostgreSQL extension versioning did not (without having to re-implement the core’s file naming code). PGXN

[...]

PostgreSQL Backup and Recovery Management using Barman
Posted by muhammad ali in Stormatics on 2024-02-22 at 17:47

Barman stands as a widely used open-source tool dedicated to managing backup and disaster recovery operations for PostgreSQL databases.

The post PostgreSQL Backup and Recovery Management using Barman appeared first on Stormatics.

Quick Benchmark: PostgreSQL 2024Q1 Release Performance Improvements
Posted by Michael Banck in credativ on 2024-02-22 at 15:55

The PostgreSQL 2024Q1 back-branch releases 16.2, 15.6, 14.11, 13.14 and 12.18 on February 8th 2024. Besides fixing a security issue (CVE-2024-0985) and the usual bugs, they are somewhat unique in that they address two performance problems by backporting fixes already introduced into the master branch before. In this blog post, we describe two quick benchmarks that show how the new point releases have improved. The benchmarks were done on a ThinkPad T14s Gen 3 which has a Intel i7-1280P CPU with 20 cores and 32 GB of RAM.

Scalability Improvements During Heavy Contention

The performance improvements in the 2024Q1 point releases concerns locking scalability improvements at high client counts, i.e., when the system is under heavy contention. Benchmarks had shown that the performance was getting worse dramatically for a pgbench run with more than 128 clients. The original commit to master (which subsequently was released with version 16) is from November 2022. It got introduced into the back-branches now as version 16 has seen some testing and the results were promising.

The benchmark we used is adapted from this post by the patch author and consists of a tight pgbench run simply executing SELECT txid_current() for five seconds each at increasing client count and measuring the transactions per second:

$ cat /tmp/txid.sql
SELECT txid_current();
$ for c in 1 2 4 8 16 32 64 96 128 192 256 384 512 768 1024 1536;
> do echo -n "$c ";pgbench -n -M prepared -f /tmp/txid.sql -c$c -j$c -T5 2>&1|grep '^tps'|awk '{print $3}';
> done

The following graph shows the average transactions per second (tps) over 3 runs with increasing client count (1-1536 clients), using the Debian 12 packages for version 15, comparing the 2023Q4 release (15.5, package postgresql-15_15.5-0+deb12u1) with the 2024Q1 release (15.6, package postgresql-15_15.6-0+deb12u1):

The tps numbers are basically the same up to 128 clients, whereas afterwards the 15.5 transaction counts drops from the peak of 650k 10-fold to 65k. The new 15.6

[...]

Point In Time Recovery Under the Hood in Serverless Postgres
Posted by Raouf Chebri in Neon on 2024-02-22 at 12:44

Imagine working on a crucial project when suddenly, due to an unexpected event, you lose significant chunks of your database. Whether it’s a human error, a malicious attack, or a software bug, data loss is a nightmare scenario. But fear not! We recently added support for Point-In-Time Restore (PITR) to Neon, so you can turn back the clock to a happier moment before things went south.

In the video below and in the PITR announcement article, my friend Evan shows you can recover your data in a few clicks. He also uses Time Travel Assist to observe the state of the database at a given timestamp to confidently and safely run the restore process.

How is this possible? This article is for those interested in understanding how PITR works under the hood in Neon. To better explain this, we will:

  1. Cover the basics of PITR in Postgres
  2. Explore the underlying infrastructure that allows for PITR in Neon.

We’ll ensure by the end of this post that you’re always prepared for disaster strikes.

Understanding the basics of Point In-Time Recovery in Postgres

PITR in Postgres is made possible using two key components:

  1. Write-Ahead Logging : Postgres uses Write-Ahead Logging (WAL) to record all changes made to the database. Think of WAL as the database’s diary, keeping track of every detail of its day-to-day activities.
  2. Base backups : Base backups are snapshots of your database at a particular moment in time.

With these two elements combined, you define a strategy to restore your database to any point after the base backup was taken, effectively traveling through your database’s timeline. However, you’d need to do some groundwork, which consists of the following:

  1. Setting up WAL archiving: By defining an archive_command and setting archive_mode to on in your postgresql.conf.
  2. Creating base backups: You can use the pg_basebackup to create daily backups.

If, for any reason, you need to restore your database, you need to recover the latest backup a

[...]

The Jobs to be Done by the Ideal Postgres Extension Ecosystem
Posted by David Wheeler in Tembo on 2024-02-21 at 17:00

The past year has seen a surge in interest in Postgres extension distribution. A number of people have noted in particular the challenges in finding and installing extensions.

PGXN, released way back in 2011, aimed to be the canonical registry for the community, but currently indexes only the 350 distributions that developers have made the effort to publish — perhaps a third of all known extensions. Furthermore, PGXN distributes source code packages, requiring developers to download, compile, install, and test them.

As a result of these challenges, a number of new entrants have emerged in recent months with a focus on discovery and ease of installation. These include dbdev, pgxman, pgpm, and our own trunk. However, they too exhibit limitations, such as minimal OS and platform availability and, notably, the manual process to add extensions, further constraining inclusion. None has even reached the number provided by PGXN.

These challenges and the interest and energy put into exploring new solutions make clear that the time has come to revisit the whole idea of the PostgreSQL extension ecosystem: to work though the jobs to be done, specify the tools to do those jobs, and outline a plan for the broader Postgres community to design and build them.

Future posts will dream up the tools and make the plan; today we begin with the jobs.

🎬 Let’s get started.


Jobs to be Done

Theodore Levitt cites a famous adage:

“Last year 1 million quarter-inch drills were sold,” Leo McGivena once said, “not because people wanted quarter-inch drills but because they wanted quarter-inch holes.”

People don’t buy products; they buy the expectation of benefits.

Today the most cited descriptor underpinning Jobs to be Done Theory, thanks to Clayton Christensen, we bear it in mind to think through the “jobs” of an idealized Postgres extension ecosystem — without reference to existing Postgres solutions. However, we cite examples for the jobs from other projects and communities, both to cla

[...]

Top posters

Number of posts in the past two months

Top teams

Number of posts in the past two months

Feeds

Planet

  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.

Contact

Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.