I’ve submitted a lot of talk proposals to a lot of Postgres conferences over the years. Some got accepted, many more were not. And I’ve been on the other side of this process too, as a member of the CfP committee responsible for selecting talks. So let me give you a couple suggestions on how to write a good talk proposal.
People tend to think that a SELECT-operation is always a read. This might look obvious and make sense but does it actually hold true? Is a SELECT-statement a read? Can we really determine from the statement what it does so that we can safely load balance? In fact, no. The fact that something seems to be a pure read does not make it a read - and that has huge implications which will be discussed in this short but
important post.
Often people try to build their own replication solution assumung that load balancing of queries can be simply handled by some kind of middleware. However, this is definitely doomed to fail. In short: Do NOT build software that magically decides in some middleware what to do.
Why is that? Consider the following:
SELECT * FROM tab;
But what if the following happens?
test=# CREATE TABLE a (id int);
CREATE TABLE
test=# CREATE FUNCTION x(int)
RETURNS int AS ' INSERT INTO a VALUES ($1) RETURNING *'
LANGUAGE 'sql';
CREATE FUNCTION
In this case I have defined a simple table and a function that just happens to insert some simple data. But what if we add the following view to the scenario?
test=# CREATE VIEW tab AS SELECT * FROM a, x(10) AS y;
CREATE VIEW
This will change things because we can access the view easily:
test=# SELECT * FROM tab;
id | y
----+---
(0 rows)
If we call it again we will already see the leftovers of the previous execution:
test=# SELECT * FROM tab;
id | y
----+----
10 | 10
(1 row)
From the application or middleware point of view we are still talking about a simple SELECT statement. What can possibly go wrong? Well, as you can see: A lot can go wrong because the string we send to the database does not carry all the information the middleware needs to make a decision whether we are talking about reading and writing.
Let us focus on what we want to solve here: We want to use our read-replicas which w
[...]
CloudNativePG 1.24 introduces a highly customisable approach to managing PostgreSQL synchronous replication through the new .spec.postgresql.synchronous
stanza. In this article, I’ll guide you through configuring synchronous replication within a single Kubernetes cluster and across multiple clusters. I’ll explore quorum-based and priority-based replication methods, highlighting their benefits and trade-offs. Additionally, I’ll explain how to adjust the synchronous_commit
setting to strike the right balance between data durability and performance.
In an increasingly globalized world, companies and organizations are leveraging distributed systems to handle massive amounts of data across geographically separated locations. Whether it is for ensuring business continuity, disaster recovery, or simply improving data access for users in different regions, replication of data between nodes situated in diverse geographical locations has become a critical aspect of modern database systems.
The post Understanding Factors Impacting Data Replication Latency in PostgreSQL Across Geographically Distributed Nodes appeared first on Stormatics.
Sarah Conway organized a great community outreach campaign for Timescale's State of PostgreSQL survey for 2024, promoted by community partners:
The focus of your backup strategy should not be backups. Instead, you should be focused on restoring databases, because, after all, that’s what matters. Simply having a backup is fine. However, if you can’t restore your data from that backup, it doesn’t matter. On top of this is the ability to ensure you can restore your databases to a point in time, not simply to the last full backup. That ability requires a different approach to how you perform your backups.
PostgreSQL has the capabilities to support backups as I already described in my first article on the topic. PostgreSQL also has the capability to restore to a point in time. However, that does require you to change the way you’re performing your backups. This article advances our understanding of how to better protect your PostgreSQL databases by expanding on the database backups and restores into a more full-blown disaster recovery process through point in time restores.
While the important part is the restore, in a classic chicken or egg conundrum, we can’t talk about restoring until we first have a backup, so I’ll start with how you need to backup your databases in preparation for a point in time restore.
Unlike SQL Server, the backups needed for point in time recovery are a little different in PostgreSQL from the full backups. Like SQL Server though, the mechanism that makes this work is a log. In the case of PostgreSQL, it’s the WAL, or Write Ahead Log. This plays an almost identical role to the database logs within SQL Server, recording all actions that modify the data or structures. This means our core process in getting the right kind of protection in place for our PostgreSQL database is the same, conceptually, as it is for SQL Server:
The backup process I showed in the first article on backups used p
[...]The reply is simple - none. pg_roles is view. But this view uses very important table pg_authid. For custom tables we can use a query:
SELECT conname, conrelid::pg_catalog.regclass AS ontable, pg_catalog.pg_get_constraintdef(oid, true) AS condef FROM pg_catalog.pg_constraint c WHERE confrelid = 'a'::pg_catalog.regclass AND contype = 'f' ORDER BY conname; ┌─────────────┬─────────┬─────────────────────────────────────┐ │ conname │ ontable │ condef │ ╞═════════════╪═════════╪═════════════════════════════════════╡ │ b_a_id_fkey │ b │ FOREIGN KEY (a_id) REFERENCES a(id) │ └─────────────┴─────────┴─────────────────────────────────────┘ (1 row)But this method doesn't work for system tables. These tables doesn't use explicitly defined foreign keys. We should to use different method. The system function pg_get_catalog_foreign_keys returns all referencies between system tables, and we can filter result:
SELECT * FROM pg_get_catalog_foreign_keys() WHERE pktable = 'pg_authid'::regclass; ┌─────────────────────────┬──────────────┬───────────┬────────┬──────────┬────────┐ │ fktable │ fkcols │ pktable │ pkcols │ is_array │ is_opt │ ╞═════════════════════════╪══════════════╪═══════════╪════════╪══════════╪════════╡ │ pg_proc │ {proowner} │ pg_authid │ {oid} │ f │ f │ │ pg_type │ {typowner} │ pg_authid │ {oid} │ f │ f │ │ pg_class │ {relowner} │ pg_authid │ {oid} │ f │ f │ │ pg_operator │ {oprowner} │ pg_authid │ {oid} │ f │ f │ │ pg_opfamily │ {opfowner} │ pg_authid │ {oid} │ f │ f │ │ pg_opclass │ {opcowner} │ pg_authid │ {oid} │ f │ f │ │ pg_language │ {lanowner} │ pg_authid │ {oid} │ f │ f │ │ pg_largeobject_metadata │ {lomowner} │ pg_authid │ {oid} │ f │ f[...]
In this post, you’ll learn a recipe that you can use to “shrink” a large table. This is a good fit when only a portion of the data is accessed, the big table has become unwieldy, and you don’t want a heavier solution like table partitioning.
This recipe has been used on tables with billions of rows, and without taking Postgres offline. How does it work?
These are the PostgreSQL instance details where this was tested:
READ COMMITTED
Let’s discuss some context.
Application databases commonly have a few tables that are much larger in size than the others. These jumbo tables have data for data-intensive features, granular information captured at a high frequency.
Because of that, the table row counts can grow into the hundreds of millions, or billions. Depending on the size of rows, the table size can be hundreds of gigabytes or even terabytes in size.
While PostgreSQL supports tables up to 32TB in size, working with large tables of 500GB or more can be problematic and slow.
Query performance can be poor, adding indexes or constraints is slow. Backup and restore operations slow down due to these large tables.
Large tables might force a need to scale the database server instance vertically to provision more CPU and memory capacity unnecessarily, when only more storage is needed.
When the application queries only a portion of the rows, such as recent rows, or rows for active users or customers, there’s an opportunity here to move the unneeded rows out of Postgres.
One tactic to do that is to DELETE
rows, but this is a problem due to the multiversion row design of Postgres. We’ll cover this in more detail in an upcoming post on massive delete operations.
Another option would be to migrate data into partitioned tables.
We aren’t going to cover table partitioning in this post, but let’
[...]Backups are dead. Now that I have your attention, let me clarify. Traditional backups have earned a solid reputation for their reliability over time. However, they are dead in the sense that a backup is essentially useless until it's restored—essentially "resurrected." In this post, we'll explore best practices for managing PostgreSQL snapshots and backups using pgBackRest. We will then provide some guidance of how you apply these techniques in Kubernetes using the Postgres Operator (PGO) from Crunchy Data. Whether you're overseeing a production environment, handling replicas, or refreshing lower environments, understanding how to effectively manage snapshots is key.
There are two effective methods for creating snapshots, but before we dive into those, let's address a common but ill-advised solution.
When working with PostgreSQL, it's crucial to avoid taking snapshots of the primary instance or running replicas for a couple of reasons:
To avoid these issues, I recommend two alternative approaches.
The first and preferred approach is to use pgBackRest for delta restores. When you snapshot a PostgreSQL instance, there's a risk of corrupt blocks being included, endangering your snapshots. pgBackRest adds a layer of protection by checking for corrupt blocks during the backup. If the previous ba
[...]Recently, I had to address a support case where a user was struggling to perform Point-in-Time Recovery in his PostgreSQL cluster using streaming replication. We have already discussed how to achieve a successful recovery in our past webinars, but the question remains: what is the impact on the standby servers? In this specific case, the user was unsure whether he needed to apply the Point-in-Time Recovery steps on the standby as well, and after doing so, the standby refused to connect to the primary.
In this post, we will examine this specific use case, perform a recovery on the primary, and explore the best approach to re-synchronize the standby servers.
For the purpose of this post, we will use 2 nodes called pg1 and pg2. Both are running on Rocky Linux 9.
On both pg1 and pg2 server, first configure the PGDG yum repositories:
$ sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/\ EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm $ sudo dnf -qy module disable postgresql
Then, install PostgreSQL and create a basic PostgreSQL cluster on pg1:
$ sudo dnf install -y postgresql16-server $ sudo PGSETUP_INITDB_OPTIONS="--data-checksums"\ /usr/pgsql-16/bin/postgresql-16-setup initdb $ sudo systemctl enable postgresql-16 --now $ sudo -iu postgres ps -o pid,cmd fx
As we are talking about Point-in-time Recovery, we need to store the WAL archives in a location shared across the nodes. In this example, we will create an archives
directory inside a /shared
NFS mount:
$ sudo -iu postgres mkdir /shared/archives
Configure the listening addresses and enable archiving in the postgresql(.auto).conf file:
listen_addresses = '*' archive_mode = on archive_command = 'test ! -f /shared/archives/%f && cp %p /shared/archives/%f'
Finally, restart the PostgreSQL service using:
$ sudo systemctl restart postgresql-16.service
On pg1, create a specific user for the replication:
$ sudo -iu postgres psql postgres=# CREATE ROLE re[...]
I previously blogged about ensuring that the “ON CONFLICT” directive is used in order to avoid vacuum from having to do additional work. You can read the original blog here: Reduce Vacuum by Using “ON CONFLICT” Directive
Now that Postgres has incorporated the “MERGE” functionality into Postgres 15 and above, I wanted to ensure that there was no “strange” behavior as it relates to vacuum when using merge. As you can see here, the “MERGE” functionality does perform exactly as expected. For example, when you attempt to have a merge where the directive is to try an insert first followed by an update, exactly one row is marked dead when the insert fails and the update succeeds.
/* Create the table: */
CREATE TABLE public.pk_violation_test (
id int PRIMARY KEY,
value numeric,
product_id int,
effective_date timestamp(3)
);
/* Insert some mocked up data */
INSERT INTO public.pk_violation_test VALUES (
generate_series(0,10000),
random()*1000,
random()*100,
current_timestamp(3));
/* Verify that there are no dead tuples: */
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup
FROM
pg_stat_all_tables
WHERE
relname = 'pk_violation_test';
schemaname | relname | n_live_tup | n_dead_tup
------------+-------------------+------------+------------
public | pk_violation_test | 100001 | 0
Then, create a simple merge and check the results:
WITH insert_query AS (
SELECT
0 AS id,
44.33893489873 AS value,
46 AS product_id,
now() AS effective_date) MERGE INTO pk_violation_test pkt
USING insert_query i ON pkt.id = i.id
WHEN MATCHED THEN
UPDATE SET
value = i.value, product_id = i.product_id, effective_date = i.effective_date
WHEN NOT MATCHED THEN
INSERT (id, value, product_id, effective_date)
VALUES (i.id, i.value, i.product_id, i.effective_date);
MERGE 1
And then check the dead tuple count:
Designing a database schema, whether for a new application or a new feature, always raises a lot of questions. The choices you make can have a big impact on how well your database performs and how easy it is to maintain and scale. Whether you’re just getting started with PostgreSQL or consider yourself a seasoned pro, it’s easy to rely on old habits or outdated advice. In this article, I want to take a fresh look at one of those topics that often sparks debate: the use of ENUMs in PostgreSQL.
I have to admit, not so long ago, I would advise "don't use ENUMs" without thinking about it too much. Relying only on random articles and some personal but outdated experience, this had been my go-to answer for some years. And while there were several limitations of ENUMs in PostgreSQL in the (distant) past, the support has improved a long time ago.
The improvements are so long-standing that:
ALTER TYPE ... ADD VALUE
in a transaction block was introduced in PostgreSQL 12 (2019).
And new features coming (at the time of writing this article) with PostgreSQL 17 allow the use of newly added values within the same transaction block (previously not possible without an explicit commit).
Therefore, I want to correct even myself and say—let's give ENUMs another chance. This article will go into detail and help us correctly decide when it makes sense to use them or not.
Every stored ENUM value occupies 4 bytes on disk. This is the size of the OID, representing the actual ENUM value, stored as a row within the pg_enum
table. You can see this yourself by querying the table directly, but it will come by default with no data defined.
# select * from pg_catalog.pg_enum;
oid | enumtyp
[...]
Timescale Inc. is a well known company in the PostgreSQL community, mostly because of their PostgreSQL extension TimescaleDB.
The survey for 2024 is open now until September 30th 2024. If you could spare some time to answer the questions, the community would appreciate it.
The results will be published shortly afer the survey has ended to everyone.
It helps everyone and every company and every user in the PostgreSQL community to get more information how PostgreSQL is used and why people are using it.
Also what is not well is important to know. Don’t spare your critics. Problems will only go away if they are known and can be solved.
Running ANALYZE
(either explicitly or via auto-analyze) is very important in order to have uptodate data statistics for the Postgres query planner. In particular after in-place upgrades via pg_upgrade
, ANALYZE
needs to be run in order to have any query statistics at all. As ANALYZE
samples only parts of the blocks in a table its I/O pattern looks more like random access than sequential read. Version 14 of Postgres has gained the possibility to use prefetching (if available, but this is the case on Linux) to tell the operating system kernel which blocks it will look at next. This is controlled via the maintenenance_io_concurrency
configuration parameter, which is set to 10 by default (contrary to effective_io_concurrency
, which is set to 1 by default).
In order to test and demonstrate the changes between version 13 and 14, we have done some quick benchmarks using the current maintenance releases (13.16 and 14.13) on Debian 12 with package from https://apt.postgresql.org. Hardware-wise, a ThinkPad T14s Gen 3 with a Intel i7-1280P CPU with 20 cores and 32 GB of RAM was used. The basis is a pgbench database, initialized with scale factor of 1000:
$ pgbench -i -I dtg -s 1000 -d pgbench
This creates 100 million rows and leads to a database size of around 15 GB. In order to have ANALYZE
do a bit more work, we increase default_statistics_target
from the default of 100 to the same value as the pgbench scale factor (i.e., 1000). This results in ANALYZE
scanning around 20% of all blocks. We then analyze the main pgbench table, pgbench_accounts
:
$ vacuumdb -Z -v -d pgbench -t pgbench_accounts
INFO: analyzing "public.pgbench_accounts"
INFO: "pgbench_accounts": scanned 300000 of 1639345 pages,
containing 18300000 live rows and 0 dead rows;
300000 rows in sample, 100000045 estimated total rows
Between runs, the file system page cache is dropped via echo 3 | sudo tee /proc/sys/vm/drop_caches
and all runs are repeated three times. The following
Let me present another “first patch” idea, related to a runtime stats on access to files storing data. Having this kind of information would be very valuable on instances with many files (which can happen for many reasons).
This is a very different area than the patch idea, which was about an extension. The runtime stats are at the core of the system, and so is the interaction with the file systems. But it’s still fairly isolated, and thus suitable for new contributors.
CloudNativePG is an open-source operator designed to manage PostgreSQL workloads on any supported Kubernetes cluster running in private, public, hybrid, or multi-cloud environments. CloudNativePG adheres to DevOps principles and concepts such as declarative configuration and immutable infrastructure.
The post Deploying Your First PostgreSQL Clusters on Kubernetes with Cloud Native Postgres appeared first on Stormatics.
In this article, I’ll explore how Kubernetes and CloudNativePG can be leveraged to streamline the testing of PostgreSQL patches, especially during a Commitfest. By automating the creation of PostgreSQL container images from any public Git repository, developers and reviewers can quickly and efficiently test patches within a Kubernetes environment. This approach not only saves time but also enhances the reliability of patches before integration into the main codebase.
In this introduction, we will explore the concept of VACUUM in PostgreSQL, its importance, and best practices for implementing effective vacuuming strategies to ensure your database remains efficient, performant, and reliable. Before we dive into strategies, problem detection and so on, it is important to understand why VACUUM is needed in the first place.
VACUUM will always exist in PostgreSQL and many myths have formed around it. Therefore, it is important to shed some light on why VACUUM is necessary. The reason why VACUUM is so relevant can be summed up in one word and one word only: “Transactions”.
Why is that the case? Consider the following example:
Transaction 1 | Transaction 2 |
SELECT count(*) FROM tab; | |
… returns 0 … | |
BEGIN; | |
INSERT INTO table VALUES (1), (2), (3); |
|
SELECT count(*) FROM tab; | SELECT count(*) FROM tab; |
… returns 3 … | … returns 0 … |
ROLLBACK; | ← 3 rows are now eliminated |
What is important to understand is that our two sessions will see different data. The left session will actually see 3 rows, while the right session, happening at the same time, won’t see any rows because the left transaction has not been committed yet.
The question naturally arising is: How can we get rid of those 3 rows that are subject to our ROLLBACK? In contrast to MS SQL and Oracle, PostgreSQL does NOT remove those rows from disk during a ROLLBACK. That is the reason why ROLLBACK is lightning fast in PostgreSQL and usually a real
[...]I wrote recently about the performance of PostgreSQL when compiled with different compilers and optimization levels. Another dimension in that evaluation is link-time optimization (LTO).
LTO allows the compiler to perform optimizations across source-file boundaries (more correctly, compilation unit boundaries).
Sounds useful, so let’s give it a try. I’m using the same setup as in the previous article, except this time I’m up to tag REL_16_4
. To enable LTO, I’m using the meson option -Db_lto
.
As before, the numbers in the following table are transactions per second (tps) as reported by pgbench (higher is better):
OS | Compiler | LTO | debugoptimized | release |
---|---|---|---|---|
macOS 14 | gcc-14 | false | 48063 | 48248 |
true | 49554 | 50427 | ||
clang-18 | false | 48143 | 47864 | |
true | 50014 | 49935 |
These results show that there is a positive effect from LTO, but it is quite limited. The improvement is about 3–5% tps, which is not nothing, but it’s also not overwhelming.
I also tried with the “release” buildtype in the hope that it would result in additional impact, maybe because the compiler is trying harder, but it was not there.
My recommendation here is the same as for the optimization level: Try to stick with what most people are using. If, say, some Linux distribution switched all their builds to use LTO, then ok. But otherwise, you don’t want to be the only one running a particular configuration in production.
These are my least favorite mysteries to resolve. I hate mysteries when it comes to performance problems. A performance mystery means I do not know how to make it work faster! However, I love solving mysteries—except when presented with a “sometimes it is slow” case.
Cases like “it was fast just yesterday, and now it is slow” are easier because you have a pretty good idea of what you might want to check: stale statistics, increased data volumes, “not fit into memory anymore” – all usual suspects. But what do you do with “sometimes”?! What are these “times”? The only thing you can suspect are prepared statements, and if that’s not the case, you are out of ideas.
***
Recently, I was an unfortunate receiver of one of these “sometimes it is slow” service tickets. The only advantage I had this time was a “new pgBadger” in place, so I could have a full view of what was happening inside the database during this slowness. Still, I could not explain what I saw. Yes, the process was definitely suboptimal, but I still could not explain what could make some queries wait for an extended time.
Finally, after examining multiple snapshots during the slow time, I figured it out. There were two contributing factors which had nothing to do with Postgres.
In addition, there are some Postgres specifics that contributed, so let’s look at what exactly happened.
Welcome to Part 6 of our series exploring the exciting new features anticipated in the official PostgreSQL 17 release. In this series, we delve into newly added features and discuss how they can benefit database developers and migration engineers transitioning to PostgreSQL 17 in future.
In case you missed the earlier parts:
Ready to enhance your PostgreSQL development skills? My course on PostgreSQL and PL/pgSQL will help you master database development. Click here to start your journey, and use code DBROOKIE20 to avail 20% off!
In databases, the login event is the first interaction a user has when connecting, making it an ideal candidate to trigger functionality based on different use cases. Logon triggers are used for various purposes, such as logging, enforcing business constraints, or populating global variables within database sessions.
If you are wondering till now how connections were logged so we have log_connections
and log_disconnections
database configuration to enable it at instance level.
Let’s explore Login Event Triggers with an example that mandates checks on application_name
for every connection to a database.
application_name
for Every Connection
In the example below, we create a Login Trigger that ensures all connections specify an application_name
. Setting application_name can be useful for code instrumentation, root cause analysis, and is part of the pg_stat_activity
view, which can be used for snapshots of database or connection processes at the application level.
CREATE OR REPLACE FUNCTION mandate_app_info() RETURNS event_trigger AS $$
BEGIN
IF nullif(current_setting
The annual PgUS membership meeting was prepared and conducted by: * Stacey Haysler (President) * Mark Wong (Treasurer & Expo Committee) * Michael Brewer (Secretary) * Jonathan S. Katz (Director & Conference Committee) * Elizabeth Garrett Christensen (Director) * Chelsea Dole (Diversity Committee) * Valeria Kaplan (User Group Committee) * James Myers (Volunteer coordinator) * Katharine Saar (Membership Coordinator)
Number of posts in the past two months
Number of posts in the past two months
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.