The source code is available on GitLab; a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.
The extension is also available on PGXN.
As PostgreSQL 12 is no longer supported (it went out of support on November 21, 2024), it has been removed from the supported versions of pgsql_tweaks.
The extension now has a new function that creates Markdown documentation for a given schema name: get_markdown_doku_by_schema.
So far the following objects are supported:
For each object, the basic information and comments are included in the documentation.
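For a single schema, a minimal call looks like this (a sketch assuming the extension objects live in a schema named stats, as in the larger example below, and that a schema public exists):
SELECT stats.get_markdown_doku_by_schema('public');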
This example generates the documentation as Markdown for the current database:
WITH res AS
(
SELECT array_agg (stats.get_markdown_doku_by_schema(schema_name)) AS markdown
FROM information_schema.schemata
WHERE information_schema.schemata.catalog_name = (current_database())::information_schema.sql_identifier
-- Exclude some system schemas
AND schema_name NOT IN
(
'information_schema',
'pg_catalog',
'pg_toast'
)
-- Exclude empty results
AND COALESCE (stats.get_markdown_doku_by_schema(schema_name), '') <> ''
)
-- CHR(10) is a line feed and is used here to have two
-- line breaks between every schema result
SELECT array_to_string(markdown, CHR(10) || CHR(10)) AS markdown_for_all_schemas
FROM res
;
Have you ever experienced your database slowing down as the amount of data increases? If so, one important factor to consider is tuning PostgreSQL parameters to match your specific workload and requirements.
PostgreSQL has many parameters because it is designed to be highly flexible and customizable to meet a wide range of use cases and workloads. Each parameter allows you to fine-tune different aspects of the database, such as memory management, query optimization, connection handling, and more. This flexibility helps database administrators to optimize performance based on hardware resources, workload requirements, and specific business needs.
In this blog, I will cover some of the important PostgreSQL parameters, explain their role, and provide recommended values to help you fine-tune your database for better performance and scalability.
Memory-Related Parameters
Memory-related parameters in PostgreSQL control how the database allocates and manages memory. Tuning these settings is important for improving query performance and preventing resource bottlenecks.
Name: work_mem
Description: Sets the maximum amount of memory used by internal operations like sorts and hashes before writing to disk. Increasing it can improve performance for complex queries.
Default: 4MB
Recommended: Typically, setting work_mem to 1-2% of the total available system memory is recommended; e.g., if the total system memory is 256 GB, assign 3 to 5 GB for work_mem.
Note: work_mem is allocated per sort or hash operation, not once per query or connection, so raising it can lead to much higher overall memory usage for workloads with many concurrent sorting operations.
Name: shared_buffers
Description: Determines the amount of memory allocated for caching database data.
Default: 128MB
Recommendation: Typically, setting shared_buffers to 25-40% of the total system memory is recommended; e.g., if the total system memory is 256 GB, assign 64-102 GB for shared_buffers.
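As a hedged illustration of applying these two settings (example values only, using the 256 GB machine from the text; ALTER SYSTEM writes to postgresql.auto.conf):
ALTER SYSTEM SET shared_buffers = '64GB'; -- ~25% of 256 GB; requires a server restart
ALTER SYSTEM SET work_mem = '4GB';        -- ~1-2% of 256 GB; allocated per sort/hash operation
SELECT pg_reload_conf();                  -- picks up work_mem; shared_buffers still needs a restart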
Name: maintenance_work_mem
Description: Specifies the amount of memory used for maintenance operations like VACUUM, CREATE INDEX, and ALTER TABLE. Increa
PostgreSQL streaming replication is pretty fast. It is based on the transaction log (WAL) and the crash recovery mechanism. This work is something the primary must handle anyway for safe operation, so the overhead on the primary side is negligible. The only extra work is pushing WAL out onto the network in addition to disk.
On the replica side, the apply process is also quite lightweight, because the transaction log contains physical-layer changes. When the database system receives an update, tasks such as finding the row, locating space for the new version, finding the correct place to insert new references into the indexes, and splitting pages if necessary are all done by the primary. This work is encoded in the transaction log stream as modifications to specific pages. The replay process only has to find the page and replay the modification, so no “thinking” is necessary.
However, there is one problem here. On the primary side, transaction log sequencing is highly optimized. Each database backend prepares its transaction log record in local memory, then reserves space in WAL for it, and then proceeds to copy it to a shared memory buffer. The space reservation is the only thing that happens sequentially; everything else happens concurrently. And the sequential part is only incrementing a location pointer under a spinlock, which is about as fast as it can be. Therefore, on the primary we have lots and lots of backends running through write transactions, encoding the work done as transaction log, and inserting it in parallel. On the replica side, there is only one process applying all of this work. The job it is doing is simpler, but at some scale, the larger number of workers will be able to overwhelm it.
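To see whether the single apply process is keeping up, a minimal sketch run on the primary (pg_stat_replication, pg_current_wal_lsn(), and pg_wal_lsn_diff() are standard):
SELECT application_name,
       replay_lag,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;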
So where is the point where replication will fall behind? As with everything in databases, the answer is “it depends”. Different workloads will have different characteristics. If the working set of modified pages does not fit into cache, the replay process w
[...]Statistically, September CommitFests feature the fewest commits. Apparently, the version 18 CommitFest is an outlier. There are many accepted patches and many interesting new features to talk about.
If you missed the July CommitFest, get up to speed here: 2024-07.
...
PostgreSQL Europe will stop posting on Twitter/X. This includes our main accounts and all conference accounts we operate or help to organize. Recent changes to the platform's policies steadily transform Twitter/X into a space which we believe spreads a lot of hate and is no longer aligned with our community spirit and values.
Please connect with us on our various Social Media channels:
And of course post on our pgeu-general mailing list.
The first round of minor versions in 2025 was released this week: PostgreSQL 17.3, 16.7, 15.11, 14.16, and 13.19 Released! But wait: an out-of-cycle release is scheduled for February 20, 2025.
PostgreSQL 18 changes:
- Cost-based vacuum delay time tracking added to pg_stat_progress_analyze and pg_stat_progress_vacuum and associated log output
- Eagerly scan all-visible pages to amortize aggressive vacuum
- Logical decoding of already-aborted transactions will be skipped
- pgcrypto: support for CFB mode in AES encryption added
PostgreSQL 18 articles:
- Waiting for PostgreSQL 18 – Add cost-based vacuum delay time to progress views (2025-02-13) - Hubert 'depesz' Lubaczewski
- PostgreSQL 18: Virtual generated columns (2025-02-10) - Daniel Westermann / dbi services

I’ve been putting together a new PostgreSQL session called “Performance Monitoring for the Absolute Beginner.” There are several ways to get an understanding of how well your queries are running in PostgreSQL, but, frankly, all of them are a bit of a pain to someone coming from the land of Extended Events (ah, my one […]
The post Free PostgreSQL Performance Monitoring with pgNow appeared first on Grant Fritchey.
Have you ever faced timeline issues during PostgreSQL recovery? It can be frustrating to complete a restore only to discover during recovery that the timeline is invalid, which often results in an error message like this:
FATAL: requested timeline 2 is not a child of this server's history
DETAIL: Latest checkpoint is at 0/7000028 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/600AA20.
Understanding PostgreSQL timelines is crucial, but the topic can be confusing for those unfamiliar with it.
In this post, we’ll explore why this happens, what you can do about it, and how to detect it earlier.
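One early check, as a minimal sketch: ask the running cluster which timeline it is on with the standard pg_control_checkpoint() function, and compare that against the timeline of the backup you intend to restore.
SELECT timeline_id, checkpoint_lsn
FROM pg_control_checkpoint();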
Let’s start with a simple scenario: a full backup followed by some generated WAL archives.
$ pgbackrest info
stanza: ro9pg
    status: ok
    cipher: aes-256-cbc

    db (current)
        wal archive min/max (17): 00000001000000000000000E/000000010000000000000031

        full backup: 20250206-102246F
            timestamp start/stop: 2025-02-06 10:22:46+00 / 2025-02-06 10:22:49+00
            wal start/stop: 00000001000000000000000E / 00000001000000000000000E
            database size: 22.2MB, database backup size: 22.2MB
            repo1: backup size: 2.7MB
Now, someone unexpectedly promotes a standby or performs a test restore to validate the backup but forgets to disable WAL archiving on the temporary cluster.
$ pgbackrest info
...
        wal archive min/max (17): 000000010000000000000003/00000002000000000000003A
$ pgbackrest repo-ls archive/ro9pg/17-1
0000000100000000
00000002.history
0000000200000000

$ pgbackrest repo-ls archive/ro9pg/17-1/0000000100000000
...
000000010000000000000031-f35618b3a1176b39059b5aac8fd17f8ecd63d3be.zst
000000010000000000000032.partial-826c8a36ae1a5d4a624720382ece08f2b9f4bf29.zst

$ pgbackrest repo-ls archive/ro9pg/17-1/0000000200000000
000000020000000000000032-e22f694a0bd949cda0be235f09b5620351eb3f4c.zst
000000020000000000000033-81d92d49082a857880affca6a084f0e6d9d3bd26.zst
...

$ pgbackrest repo-get archive/ro9pg/17-1/00000002.history
1   0/32899B98  no recovery target spec[...]
Finally I found time to get a grip on the issues I had with gohugo and get my blog operational again.
I’m taking the occasion to write about a project I care a lot about, as it’s strictly related to my hometown: the upcoming PGDay Napoli.
Every time I host a Postgres Meetup, I am thankful to everyone who comes. But today, I am especially thankful because so many people showed up despite the snowstorm! Staying after work to attend a meetup in these weather conditions demonstrates true dedication!
Many thanks to Steve Zelaznik for his wonderful presentations (recording will be available!), and thanks to all participants for listening, asking insightful questions, and a lively discussion afterward!
I have seen the error from the title reported often enough. If you are not familiar with the internals of PostgreSQL, the message will confuse you: what are xmin and relfrozenxid? Why is it a problem if one is before the other? So I think that it is worth the effort to write an article about the problem and what the error means. I'll also show you how to cope with the situation and fix the error.

What are xmin and relfrozenxid?

xmin and xmax
Each PostgreSQL table entry (“tuple”) has the system columns xmin and xmax. They contain the transaction ID of the transactions that created and invalidated (updated or deleted) the version of the row that the tuple represents. Each SQL statement has a snapshot that determines which transaction IDs it can see. If a statement can see the tuple's xmin, and the xmax is invalid (has a value of 0), invisible or belongs to a transaction that has not committed, that version of the row is visible to the statement. (For additional details about xmax, see this article.)
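To see these system columns for yourself, a minimal sketch (users is just an example table name):
SELECT xmin, xmax, *
FROM users
LIMIT 3;
-- xmax = 0 means this row version has not been updated or deleted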
One of the problems with PostgreSQL's multi-versioning architecture is that transaction IDs are generated from a 4-byte unsigned integer counter. Once the counter reaches its maximum value, it will “wrap around” to 3 (the values from 0 to 2 have special meanings). As a consequence, transaction IDs change their meaning as time goes by: a transaction ID that belonged to a committed transaction may now belong to a future transaction or one that is rolled back. That would mean that visible rows could suddenly become invisible, thus causing data corruption. If you want to see that at play, you should read my article on transaction ID wraparound.
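To check how far transaction ID consumption has progressed per table, a minimal sketch using the standard age() function on relfrozenxid:
SELECT relname,
       age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 5;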
To prevent this kind of data corruption from happening, VACUUM (typically triggered by autovacuum) freezes old, visible table rows: it sets a flag on the row that indicates that the reader should ignore xmin and xmax. That flag marks the row as unconditionally visible. Once PostgreSQL
Anecdotally, this might be the most requested feature in SQL: Allow some trailing commas.
The classic example is
SELECT a,
b,
c, -- here
FROM ...
Another one is
CREATE TABLE tab1 (
a int,
b int,
c int, -- here
)
There might be a few other popular ones. (Please send feedback.)
How could we support that? And by “we”, I primarily mean, either in PostgreSQL, which I work on, or in the SQL standard, which I work on. And ultimately in other SQL implementations, which may or may not follow either of these two sources.
Implementing the above two cases in PostgreSQL is easy. Done?!?
But there are loads more places in SQL with comma-separated lists, for example, array constructors, row constructors, function calls, as well as in various commands, such as function definitions, type definitions, COPY, CREATE PUBLICATION, many others. What to do about this? Is there a line to draw somewhere?
I can see a few possible approaches:
1. We just support a few of the most requested cases. Over time, we can add a few more if people request it.
2. We support most cases, except the ones that are too complicated to implement or cause grammar conflicts.
3. We rigorously support trailing commas everywhere commas are used and maintain this for future additions.
These are all problematic, in my opinion. If we do option 1, then how do we determine what is popular? And if we change it over time, then there will be a mix of versions that support different things, which will be very confusing. Option 2 is weird: how do you determine the cutoff? Option 3 would do the job, but it would obviously be a lot of work. And there might be some cases where it’s impossible, and it would have to degrade into option 2.
In any case, it would also be nice to get this into the SQL standard. Then we can aim for some consistency across implementations in the long run.
There, we have the same questions, but we need to be even
[...]After reading Paul's teasing article on Accessing Large Language Models with AI and testing out his code, I decided I should probably stop fighting this AI smothering. I still have a distrust of AI services, but downloading an AI model and using it on my own local desktop or server is nice, and I can break bread with that. One use I tried is using AI to generate fake data, and it did a pretty decent job. I also learned a couple of things.
Continue reading "Learning PostgreSQL from AI and JSON exploration"

For PostGIS Day this year I researched a little into one of my favourite topics, the history of relational databases. I feel like in general we do not pay a lot of attention to history in software development. To quote Yoda, “All his life has he looked away… to the future, to the horizon. Never his mind on where he was. Hmm? What he was doing.”
Anyways, this year I took on the topic of the early history of spatial databases in particular. There was a lot going on in the ’90s in the field, and in many ways PostGIS was a late entrant, even though it gobbled up a lot of the user base eventually.
Postgres' ability to execute queries in parallel is a powerful feature that can significantly improve query performance, especially on large datasets. However, like all resources, parallel workers are finite. When there aren't enough available workers, Postgres may downgrade a parallel query to a serial (non-parallel) execution. This sounds reasonable unless the performance of the downgraded query is well beyond the required response times needed by the application.
While helping our clients with Oracle to PostgreSQL migrations, we often see query downgrading as a common challenge. In Oracle 11.2, Oracle introduced a feature called "Parallel Statement Queuing." This feature prevents downgrades by queuing parallel queries until enough parallel PX servers are available to handle the request.
This post explores how parallel queries work, what triggers downgrades, and how you can monitor and optimize parallel worker usage to prevent performance bottlenecks. We'll also explore a sample solution that mirrors Oracle's Parallel Statement Queuing feature.
When PostgreSQL executes a query in parallel, it divides the work of one or more query nodes (tasks) across multiple processes called parallel workers. These workers cooperate to process parts of the data simultaneously, reducing query time for operations like scans, joins, and aggregations. The database allocates parallel workers up to the maximum defined by the max_parallel_workers configuration setting. If parallel workers cannot be allocated, the query is downgraded to serial execution.
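A minimal sketch for spotting a downgrade in practice (big_table is a hypothetical large table):
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;
EXPLAIN (ANALYZE)
SELECT count(*) FROM big_table;
-- In the Gather node of the plan, compare "Workers Planned: N" with
-- "Workers Launched: M"; M < N means the query ran with fewer workers
-- than planned, i.e., it was partially or fully downgraded.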
There are a few key reasons why a parallel query may be downgraded:
Organizing Committee FOSDEM PGDay and the PostgreSQL dev room at FOSDEM:
Talk Selection Committee FOSDEM PGDay and the PostgreSQL dev room at FOSDEM:
Other volunteers for FOSDEM PGDay and the PostgreSQL dev room:
Talk recordings for the PostgreSQL dev room: https://fosdem.org/2025/schedule/track/postgresql/
Talking Postgres podcast Feb 5th
Talking Postgres podcast Jan 15th
Talking Postgres podcast Dec 4
Talking Postgres podcast Nov 13
The people that made Prague PostgreSQL Developer Day (P2D2) a reality: Organizers & CfP Committee:
Speakers and Training Lecturers:
Database VIEWs are powerful tools that often don't get the attention they deserve when building database-driven applications. They make our database work easier in several ways:
Let's see how this works with a practical example. Imagine we want to work with active users - users who have used the application within the last 7 days. Instead of writing this condition in every query, we can define a view:
CREATE VIEW active_users AS
SELECT
*
FROM users
WHERE
last_login > current_date - INTERVAL '7 days';
Now we can easily use this view whenever we need active users. For example, if we want to find active users from Germany:
SELECT
user_id
FROM active_users
WHERE
country = 'Germany';
While this simple example shows how views can make developers' lives easier by organizing and reusing common logic, there's more to the story. In this article, we'll explore something even more interesting: how PostgreSQL can optimize these views through a process called "inlining" - making them not just convenient, but fast too.
When you use a view, it acts like a building block in SQL queries. Taking our previous example, PostgreSQL effectively transforms the query by replacing the view with its underlying subquery.
SELECT
user_id
FROM (
SELECT
*
FROM users
WHERE
last_login > current_date - INTERVAL '7 days'
) active_users
WHERE
country = 'Germany';
While we write the query using the active_users view, the PostgreSQL query planner sees an opportunity to optimize it further. Instead of treating the subquery as a separate step (and retrieving a large subset of the users), it effectively transforms the query and inlines the view into the query itself. Behind the scenes, PostgreSQL will execute the query similar to:
SELECT
[...]
With the postgis_raster extension, it is possible to access gigabytes of raster data from the cloud, without ever downloading the data.
How? The venerable postgis_raster extension (released 13 years ago) already has the critical core support built-in!
Rasters can be stored inside the database, or outside the database, on a local file system or anywhere it can be accessed by the underlying GDAL raster support library. The storage options include S3, Azure, Google, Alibaba, and any HTTP server that supports RANGE requests.
As long as the rasters are in the cloud-optimized GeoTIFF (aka "COG") format, network access to the data will be optimized, and access performance will be limited mostly by the speed of the connection between your database server and the cloud storage.
Set up a database named raster with the postgis and postgis_raster extensions.
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_raster;
ALTER DATABASE raster
SET postgis.gdal_enabled_drivers TO 'GTiff';
ALTER DATABASE raster
SET postgis.enable_outdb_rasters TO true;
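Once a COG URL is known (one is introduced below), a plausible next step, sketched here with the standard raster2pgsql flags (-R registers the raster out-of-db by path, -I builds a spatial index, -t sets the tile size; mrdem is a hypothetical table name):
raster2pgsql -R -I -t 256x256 \
  /vsicurl/https://datacube-prod-data-public.s3.amazonaws.com/store/elevation/mrdem/mrdem-30/mrdem-30-dsm.tif \
  mrdem | psql -d raster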
COG is still a new format for public agencies, so finding a large public example can be tricky. Here is a 56GB COG of medium-resolution (30m) elevation data for Canada. Don't try to download it; it's 56GB!
You can see some metadata about the file by using the gdalinfo utility to read the headers.
gdalinfo /vsicurl/https://datacube-prod-data-public.s3.amazonaws.com/store/elevation/mrdem/mrdem-30/mrdem-30-dsm.tif
Note that we prefix the URL to the image with /vsicurl/ to tell GDAL to use virtual file system access rather than direct download.
There is a lot of metadata!
Driver: GTiff/GeoTIFF
Files: /vsicurl/https://datacube-prod-data-public.s3.amazonaws.com/store/elevation/mrdem/mrdem-30/mrdem-30-dsm.tif
Size is 183687, 159655
Coordinate System is:
PROJCRS["NAD83(CSRS) / Canada Atlas Lambert",
BASEGEOGCRS["NAD83(CSRS)",
DATUM["NAD83 C
FOSDEM has long been a key event for the PostgreSQL community, and 2025 was no exception. Every year, PostgreSQL developers, contributors, and users worldwide gather in Brussels for a week packed with talks, meetings, and discussions. Our community has a strong tradition at FOSDEM, with a dedicated developer room, a booth, and numerous side events.
One of the most recognizable FOSDEM traditions is FOSDEM Weather — a perfect mix of rain, cold, and clouds that always greets us in Brussels. Over the years, we even started selling PostgreSQL-branded umbrellas at our booth, and you could track the weather by watching the sales spike! But this year, for the first time in a while, the weather was sunny and perfect — so we joked that the weather was trying to lower our sales. 😄
FOSDEM week is always busy for the PostgreSQL community, with multiple events happening alongside the main conference:
Held earlier in the week, the Prague PostgreSQL Developer Day (P2D2) is a major PostgreSQL event that gathers experts for in-depth discussions. For a detailed experience of the event, check out Ants Aasma’s blog post.
On Thursday, January 30th, PostgreSQL core developers and contributors held a private developer meeting to discuss ongoing projects, significant challenges, and future PostgreSQL development. More details can be found on the official event page.
On Friday, January 31st, we had FOSDEM PGDay, a full-day PostgreSQL conference. The schedule featured talks on a wide range of PostgreSQL topics, from performance tuning to backups and new features. The complete schedule is available here.
The PostgreSQL Devroom took place on Sunday, February 2nd, featuring sessions on database performance, replication, analytics, AI/ML, and more. This year, we had great discussions about PostgreSQL internals, best practices, and real-world use cases. You can fin
[...]The Citus 13.0 release is out and includes PostgreSQL 17.2 support! We know you’ve been waiting, and we’ve been hard at work adding features we believe will take your experience to the next level, focusing on bringing the exciting Postgres 17 improvements to you at distributed scale.
The Citus database is an open-source extension of Postgres that brings the power of Postgres to any scale, from a single node to a distributed database cluster. Since Citus is an extension, using Citus means you're also using Postgres, giving you direct access to the Postgres features. And the latest of such features came with Postgres 17 release! In addition, Citus 13 will be made available on the elastic clusters (preview) feature on Azure Database for PostgreSQL - Flexible Server, along with PostgreSQL 17 support, in the near future.
PostgreSQL 17 highlights include performance improvements in query execution for indexes, a revamped memory management system for vacuum, new monitoring and analysis features, expanded functionality for managing data in partitions, optimizer improvements, and enhancements for high-concurrency workloads. PostgreSQL 17 also expands on SQL syntax that benefits both new workloads and mission-critical systems, such as the addition of the SQL/JSON JSON_TABLE() command for developers, and the expansion of the MERGE command. For those of you who are interested in upgrading to Postgres 17 and scaling these new features of Postgres: you can upgrade to Citus 13.0!
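As a tiny, hedged illustration of the JSON_TABLE() syntax mentioned above (not taken from the Citus post):
SELECT jt.name
FROM JSON_TABLE(
    '[{"name": "alice"}, {"name": "bob"}]',
    '$[*]' COLUMNS (name text PATH '$.name')
) AS jt;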
Along with Postgres 17 support, Citus 13.0 also fixes important bugs, and we are happy to say that we had many community contributions here as well. These bugfixes focus on data integrity and correctness, crash and fault tolerance, and cluster management, all of which are critical for ensuring reliable operations and user confidence in a distributed PostgreSQL environment.
Let's take a closer look at what's new in Citus 13.0:
[...]This is a translation of my talk "Debugging PostgreSQL planner" from the PGBootCamp 2024 conference.
You can find the repository with the source code and other material here.
In this post we will look at how the PostgreSQL planner works at the code level (functions and data structures), and how to hack on the planner.
Node and its children
After some theory, we will implement and add a little feature to the planner.
Work will be done in the folder I linked earlier (it is the cwd).
If you want to reproduce some parts of this post, you need to set up the repository.
All you need to do is run init.sh. Among other things, this script creates the .vscode folder.
For downloading the source code you also need to have wget or curl, and tar to unpack the archive.
If they are missing, install them manually, or download the archive manually using this link and store it in the same directory as the init.sh script.
For building and debugging PostgreSQL you also need to have these libraries/executables installed:
libreadline
bison
flex
make
gcc
gdb (or another debugger)
CPAN (for Perl)
You can install them like this:
# Debian based
sudo apt update
sudo apt install build-essential gdb bison flex libreadline-dev git
# RPM based
sudo yum update
sudo yum install gcc gdb bison flex make readline-devel perl-CPAN git
So, the whole setup pipeline is this:
# Clone repository and go to directory with meetup files
git clone https://github.com/TantorLabs/meetups
cd "meetups/2024-09-17_Kazan/Sergey Solovev - Debugging PostgreSQL planner"
# Run i
For several years there has been a PGDay in Brussels on the Friday before FOSDEM, organized by PostgreSQL Europe.
The talks that I attended have been good. Only the speakers have not been very diverse.
The same was true of the audience. I counted only four women, and three of them are members of the PostgreSQL Europe Diversity Committee.
And all three members of the Diversity Committee gave a lightning talk.
For the first time ever I gave a lightning talk at a conference. The idea is based on the Books paragraph in a blog post I wrote about sources for learning PostgreSQL.
The idea came from Claire Giordano, and Karen Jex encouraged me to submit it.
A big thank you goes to Floor Drees for bringing lightning talks to Brussels.
https://fosdem.org/2025/schedule/track/postgresql/
I extended the book list on the blog with some additional information; you can find the slides on GitLab.
FOSDEM is the biggest open source conference in Europe, scheduled on the first weekend of February each year. It takes place in Brussels at the Université Libre de Bruxelles (ULB).
This year was the 25th birthday of the conference, and it had overall more than 700 talks on two days.
For several years now there has been a so-called devroom, where PostgreSQL Europe organizes talks and manages the audience. There is a video host, a room host, and several room bouncers, aka Elephant Herders.
In addition, a stand has to be staffed, where people ask questions and can buy PostgreSQL merchandise.
Everything has to be taken to Brussels; that has been handled by Andreas Scherbaum and Ilya Kosmodemiansky. They both came by car, packed with shirts, hoodies, pins, stickers, and small plush Sloniks.
Not to forget that there is a Slonik costume that has to be worn by people. And as one doesn’t see a lot from inside the costume, another person is needed
[...]
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.