PostgreSQL on ARM-based AWS EC2

The expected growth of ARM processors in data centers has been a hot topic of discussion for quite some time, and we were curious to see how it performs with PostgreSQL. The general availability of ARM-based servers for testing and evaluation was a major obstacle. The icebreaker came when AWS announced its ARM-based processor offering in its cloud in 2018. But we didn’t see much excitement immediately, as many considered it more of an “experimental” offering. We were also cautious about recommending it for critical use and never put enough effort into evaluating it. But when the second generation of Graviton2-based instances was announced in May 2020, we wanted to give it serious consideration. We decided to take an independent look at the price/performance of the new instances from the standpoint of running PostgreSQL.

Important: Note that while it’s tempting to call this a comparison of PostgreSQL on x86 vs. ARM, that would not be correct. These tests compare PostgreSQL on two virtual cloud instances, and that includes many more moving parts than just a CPU. We’re primarily focusing on the price/performance of two particular AWS EC2 instances based on two different architectures.

Test Setup

For this test, we picked two similar instances. One is the older m5d.8xlarge, and the other is a new Graviton2-based m6gd.8xlarge. Both instances come with local “ephemeral” storage that we’ll be using here. Using very fast local drives should help expose differences in other parts of the system and avoid testing cloud storage. The instances are not perfectly identical, as you’ll see below, but are close enough to be considered the same grade. We used the Ubuntu 20.04 AMI and PostgreSQL 13.1 from the pgdg repo. We performed tests with small (in-memory) and large (I/O-bound) database sizes.

Instances

Specifications and On-Demand pricing of the instances as per the AWS Pricing Information for Linux in the Northern Virginia region. With the currently listed prices, m6gd.8xlarge is 25% cheaper.

Graviton2 (arm) Instance
[...]
Posted by Mark Wong on 2021-01-21 at 21:23

At PGCon 2020, I gave a presentation proposing that the PostgreSQL community could use better benchmarking tools to serve open source developers.  Some of us continued discussing that during the unconference shortly after.  Here I describe a proof of concept for an open source framework to run user defined benchmarks (but not necessarily limited to them) that follow the model of loading a database and running a series of queries.

The goal of this framework is to provide a simpler way to create, modify, and execute benchmarks.  I will try to demonstrate that by using dbgen and qgen provided by the TPC-H Benchmark (TM).  My intent is to use this well-known benchmark only to show how the framework can be used to customize workloads for testing and development purposes.  For those not familiar, it models a data warehouse type of system that may execute large and complex reporting queries.

I'll give a couple of examples such as changing a table definition and how to modify a query.

The framework itself is just for executing a set of scripts.  I've named the user defined benchmark referenced here pgnottpch.  I've created a brief 51-second demo of executing this benchmark.  It doesn't exactly match what I'm describing here anymore, since it's still very volatile, but it will hopefully provide additional context around the examples shown here.

First let's review the information for the pgnottpch user defined benchmark (this is a snippet of the complete description):

$ tsbyo1 info ~/touchstone-udb/pgnottpch
...
This is **NOT** a TPC-H. Go to ``_ for a real TPC-H.

This is a set of scripts that perform some tasks that appear to be a TPC-H
benchmark but is absolutely not comparable to a TPC-H benchmark.
...
STAGES:
00createdb
01load
02power

I want to highlight the lines displayed under the STAGES heading towards the bottom of the output that correspond to subdirectories under pgnottpch containing executable scripts:

  • 00createdb simply runs the PostgreSQL createdb command
[...]

Building maps that use dynamic tiles from the database is a lot of fun: you get the freshest data, you don't have to think about generating a static tile set, and you can do it with very minimal middleware, using pg_tileserv.

However, the day comes when it's time to move your application from development to production. What kinds of things should you be thinking about?

As CYBERTEC keeps expanding, we need a lot more office space than we previously did. Right now, we have a solution in the works: a new office building. We wanted something beautiful, so we started to dig into mathematical proportions to achieve a reasonable level of beauty. We hoped to make the building not just usable, but also liked by our staff.

I stumbled upon some old books about proportions in architecture and went to work. Fortunately, one can use PostgreSQL to do some of the more basic calculations needed.

Basic rules for beauty

This is of course a post about PostgreSQL, and not about architecture – but let me explain a very basic concept: golden proportions. Beauty is not random; it tends to follow some mathematical rules. The same is true in music. “Golden proportions” are a common concept. Let’s take a look:


test=# SELECT 1.618 AS constant,
     round(1 / 1.618, 4) AS inverted,
     round(pow(1.618, 2), 4) AS squared;
constant  | inverted | squared
----------+----------+---------
1.618     | 0.6180   | 2.6179
(1 row)

We are looking at a magic number here: 1.618. It has some nice attributes. If we invert it, we get essentially “magic number – 1”. If we square it, we get “magic number + 1”. If we take a line and break it up into two segments, we can use the ratio 1 : 1.618. Humans will tend to find this more beautiful than splitting the line at, say, 1 : 1.8976.
Naturally, we can make use of this wisdom to create a basic rectangle:

In the case of our new building, we decided to use the following size:


test=# SELECT 16.07, 16.07 * 1.618;
?column?  | ?column?
----------+----------
16.07     | 26.00126
(1 row)

The basic layout is 16 x 26 meters. It matches all mathematical proportions required in the basic layout.
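As a quick sanity check, we can ask PostgreSQL to confirm that the two sides are indeed in the golden ratio:


test=# SELECT round(26.00126 / 16.07, 3) AS ratio;
 ratio
-------
 1.618
(1 row)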

Inlining a semi-circle

To make the building look more appealing, we decided to add a small semi-circle to the rectangle. The question is: What is the ideal diameter of the semi-circle? We can use the same formula as before:


test=# WITH semi_ci
[...]

I decided to start out this year by looking into my notes from last year and extracting from them a small set of Postgres tips for you. This might not be, strictly speaking, fresh news… but I guess the average RAM of tech workers is about 64KB or so nowadays, so some repetition might not be such a bad thing.

Partial indexes

This is a huge one – and you’re really missing out if you’ve not heard of this feature or are not using it enough… as too often seems to be the case in my daily consulting work! Thus this was the first item on my list from last year. For larger amounts of data, partial indexes can literally be a life-saving trick when your back is already against the wall due to some unexpected increase in workload or disk usage. One of the reasons this feature is relatively unknown is that most other popular DBMS engines don’t have it at all, or they call it by another name.

The crux of the thing is super easy to remember – stop indexing the most common values! Since Postgres knows what your data looks like, it does not use the index anyway whenever you search for a value that’s too common. As always, when you reduce indexes, you’re not just winning on storage – you also avoid the penalties incurred when inserting or modifying the data. So: with partial indexes, when a newly inserted row contains the most common value, we don’t have to touch the index at all!

When does a value become “too common”? Sadly, there is no simple rule for this, as it depends on a couple of other config/data layout variables as well as the actual queries themselves – but roughly, starting at about 25-50% of the total rows, Postgres will tend to ignore the indexes.
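To make that concrete, here is a rough, minimal sketch (hypothetical table and status values) of what such a partial index looks like – the dominant value is simply excluded from the index:


-- Hypothetical example: most rows are 'done', so we only index the rare states
CREATE TABLE orders (
    id     bigint PRIMARY KEY,
    status text NOT NULL
);

CREATE INDEX orders_status_active_idx
    ON orders (status)
    WHERE status <> 'done';   -- rows with the common value never touch the index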

Here is a small test table with 100 million rows to help you visualize the possible size benefits.


CREATE UNLOGGED TABLE t_event (
    id int GENERATED BY DEFAULT AS IDENTITY,
    state text NOT NULL,
    data jsonb NOT NULL
);


/* 1% NEW, 4% IN_PROCESSING, 95% FINISHED */
INSERT INTO t_event (state, data)
SELECT
    CASE WHEN random() < 0.
[...]
Posted by Andreas 'ads' Scherbaum on 2021-01-18 at 14:00
PostgreSQL Person of the Week Interview with Marc Linster: I was born and raised in Luxembourg. After getting a computer science degree at the University of Kaiserslautern, I joined the Gesellschaft für Mathematik und Datenverarbeitung (today part of Fraunhofer Gesellschaft) working on expert systems and AI. In 1992, I moved to the US, working in the software and consulting business, always with a focus on data and databases. After four years at Polycom, where I had the opportunity to work on their Video as a Service, I joined EDB in 2013.
On 17th of January 2021, Magnus Hagander committed patch: Add pg_stat_database counters for sessions and session time   This add counters for number of sessions, the different kind of session termination types, and timers for how much time is spent in active vs idle in a database to pg_stat_database.   Internally this also renames the … Continue reading "Waiting for PostgreSQL 14 – Add pg_stat_database counters for sessions and session time"
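A minimal sketch of reading the new counters (column names as they appear in the released PostgreSQL 14):

-- Sketch: per-database session statistics (PostgreSQL 14+)
SELECT datname,
       sessions,                   -- sessions established
       sessions_abandoned,         -- client disappeared without disconnecting cleanly
       sessions_fatal,
       sessions_killed,
       session_time,               -- total session time, in milliseconds
       active_time,
       idle_in_transaction_time
FROM pg_stat_database
WHERE datname IS NOT NULL;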
Posted by Egor Rogov in Postgres Professional on 2021-01-18 at 00:00

We started with problems related to isolation, made a digression about low-level data structure, then discussed row versions and observed how data snapshots are obtained from row versions.

Last time we talked about HOT updates and in-page vacuuming, and today we'll proceed to the well-known vacuum vulgaris. Really, so much has already been written about it that I can hardly add anything new, but the beauty of a full picture requires sacrifice. So bear with me.

Vacuum

What does vacuum do?

In-page vacuum works fast, but frees only part of the space. It works within one table page and does not touch indexes.

The basic, "normal" vacuum is done using the VACUUM command, and we will call it just "vacuum" (leaving "autovacuum" for a separate discussion).

So, vacuum processes the entire table. It vacuums away not only dead tuples, but also references to them from all indexes.

Vacuuming is concurrent with other activities in the system. The table and indexes can be used in a regular way both for reads and updates (however, concurrent execution of commands such as CREATE INDEX, ALTER TABLE and some others is impossible).

Only those table pages where some activity took place are looked through. To detect them, the visibility map is used (as a reminder, the map tracks pages that contain only fairly old tuples, which are certain to be visible in all data snapshots). Only pages that are not tracked by the visibility map are processed, and the map itself gets updated.

The free space map also gets updated in the process to reflect the extra free space in the pages.
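A minimal illustration (hypothetical table name): run a plain vacuum and then look at the per-table statistics to see the effect:

VACUUM (VERBOSE) my_table;          -- hypothetical table

SELECT relname, last_vacuum, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'my_table';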

...

GNU Emacs is great! I can prepare my slides with PostgreSQL snippets of code and results.

PostgreSQL Literate Programming with GNU Emacs

What is literate programming? Literate Programming is a programming paradigm that makes you write a program in a more natural language, interleaving documentation and code together.
GNU Emacs allows literate programming by means of Org Mode and its module Org Babel.
I am already used to Org Mode, and I am already writing my own documentation, slides and papers with this great tool. But Org Babel can do much more for me: as you probably know, I write several articles, papers and presentations for training events, all related to PostgreSQL.
The classical workflow is:

  • write a slide or piece of document;
  • execute an SQL statement (e.g. in a terminal);
  • copy and paste the SQL statement into your slide or document;
  • copy and paste the result into your slide or document.
One huge problem with the above is that every time you change the initial statement, you have to repeat the copy and paste of the results, and this can lead to errors, inconsistencies, and the burden of keeping the documentation up to date yourself. Moreover, imagine the output of a command changes from one version of PostgreSQL to another: you have to re-run every single command and repeat the copy and paste of the results.

That’s too much!


GNU Emacs being what it is, there’s a much smarter way to do it!

Org Babel to the Rescue!

Org Babel is a module that allows Org Mode to execute a single snippet of code. The code is executed by launching external processes, like interpreters (in the case of Perl, Python, etc.), shells or, in the case of our beloved database, psql.
Let’s see an example: imagine writing the documentation for a PostgreSQL transaction as follows:



* An example of transaction

The following is a PostgreSQL explicit transaction:

#+begin_src sql :engine postgresql :dbhost miguel :dbuser luca :database emacsdb
BEGIN;

CREATE TABLE emacs( 
[...]
A while ago someone wrote on IRC that PostgreSQL has a built-in hard limit on the number of partitions. This tweet was linked as the source of this information. Decided to check it. First, let's check the claim itself. Using a simple psql script: $ CREATE TABLE test_ranged ( id serial PRIMARY KEY, payload TEXT ) partition … Continue reading "Are there limits to partition counts?"
Posted by Pavel Stehule on 2021-01-16 at 05:14
I finished and released a new version of the pspg pager. The new release can export content in CSV, TCVS, formatted text or INSERT format to a file or the clipboard.

1. Introduction

There are many approaches for PostgreSQL to retrieve data back to the user. Depending on the user’s input query, the planner module is responsible for selecting the most optimal approach to retrieve the requested data. Sequential scan is one of these approaches; it is mostly selected when the user requests a large volume of data (for example, “SELECT * from tablename;”) or when a table has no index declared. Sequential scan is mostly handled by the table access method API within PostgreSQL, and the heap access method is the default one PostgreSQL uses today. In this short post, I will show you how sequential scan is done in the table access method API.
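As a minimal sketch (hypothetical table name), you can see the planner choosing this approach with EXPLAIN; the exact costs and row estimates will of course vary:

-- With no usable index, or when most of the table is requested,
-- the plan is a sequential scan.
EXPLAIN SELECT * FROM tablename;
-- Seq Scan on tablename  (cost=... rows=... width=...)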

2. Table Access Method APIs in PostgreSQL

The pluggable table access method API has been available since PostgreSQL 12; it allows a developer to redefine how PostgreSQL stores / retrieves table data. This API contains a total of 42 routines that need to be implemented in order to complete the implementation, and honestly it is no easy task to understand and implement them all. The API structure is defined in tableam.h under the name typedef struct TableAmRoutine.

Today I will describe the routines related to sequential scan, and I hope it helps if you are looking to create your own table access method.

3. Sequential Scan Overall Call Flow

Only a few of the 42 routines are called by the executor to complete a sequential scan request. This section describes these routines in the order they are called.

3.1 relation_size

uint64        (*relation_size) (Relation rel, ForkNumber forkNumber);

relation_size is the first routine to be called and it is relatively simple. The caller will expect the routine to return the total size of the relation described by rel and forkNumber. The default heap access method will simply invoke the storage manager smgr to find the number of data blocks this particular relation physically occupies on disk and multiplies that number with the size of each block BL

[...]
Every now and then someone complains (me included) that PostgreSQL doesn't have a job scheduler. This is true, to some extent. I'll try to show you a couple of approaches to solving this particular problem. The first option, available to anyone (as long as you're running on some kind of Unix/Linux), is the system cron. Virtually every Unix/Linux system … Continue reading "How to run some tasks without user intervention, at specific times?"

A nice addition to the pgenv PostgreSQL binary manager.

pgenv special keywords: earliest and latest

I recently added support for two different keywords in pgenv: earliest and latest.
The idea is quite simple: instead of having to specify a PostgreSQL version number each time you want to work on one, you can now specify one of the above keywords to jump immediately to the oldest or newest PostgreSQL version you have installed. Of course, the newest PostgreSQL version is the most recent on a version number basis (not installation time), and on the other hand the oldest is the one with the lowest version number among those installed.
Let’s understand the concept with an example:



% pgenv versions
      12.1      pgsql-12.1
      12.3      pgsql-12.3
      12.4      pgsql-12.4
      13.0      pgsql-13.0
      9.6.20    pgsql-9.6.20



Among the versions installed above, we have that:

  • 9.6.20 is the oldest one, and therefore is mapped to earliest;
  • 13.0 is the newest one, and therefore is mapped to latest. It is quite easy to demonstrate this by means of the use command:



% pgenv use earliest

PostgreSQL 9.6.20 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log



As you can see, earliest has been resolved to version 9.6.20; on the other hand latest is going to be resolved to 13.0:



% pgenv use latest

PostgreSQL 9.6.20 stopped
PostgreSQL 13.0 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/pgsql/data/server.log



But that is not enough: you can also narrow down the scope of versions to a specific major version. For instance, in the 12 branch we have installed 12.1, 12.3 and 12.4, which means that 12.1 is the oldest version in the twelve branch, while 12.4 is the newest one. You can filter by specifying the major version number after the earliest or latest keywords:



% pgenv use latest 12

PostgreSQL 13.0 stopped
PostgreSQL 12.4 started
Logging to /home/luca/git/misc/PostgreSQL/pgenv/
[...]
Posted by Federico Campoli on 2021-01-13 at 00:00

For obvious reasons, FOSDEM this year is an online event. The staff is building an infrastructure from scratch in order to deliver the speakers’ videos in a virtual environment.

The catch is that all the talks must be pre-recorded and uploaded via pentabarf, the software historically used by FOSDEM to manage talk submissions.

What follows is my experience in recording, uploading and submitting the video for my upcoming talk.

On 6th of January 2021, Tom Lane committed patch: Add idle_session_timeout.   This GUC variable works much like idle_in_transaction_session_timeout, in that it kills sessions that have waited too long for a new client query. But it applies when we're not in a transaction, rather than when we are.   Li Japin, reviewed by David Johnston … Continue reading "Waiting for PostgreSQL 14 – Add idle_session_timeout."
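A minimal sketch of how the new GUC can be used (the exact value is of course up to you):

-- Kill sessions that sit idle outside a transaction for more than 5 minutes
SET idle_session_timeout = '5min';

-- or cluster-wide:
ALTER SYSTEM SET idle_session_timeout = '5min';
SELECT pg_reload_conf();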

Greetings friends! We have come to our final blog post in my series about the data science workflow using PostgreSQL. In the last blog post, we used PL/R to create a function which returns the output from a logistic regression model trained on our fire data. We then took that model object and stored it into a separate table.

Today we are going to finish up by showing how to use that stored model to make predictions on new data. By the way, I did all of the Postgres work for the entire blog series in Crunchy Bridge. I wanted to focus on the data and code and not on how to run PostgreSQL.

On 6th of January 2021, Tomas Vondra committed patch: Report progress of COPY commands   This commit introduces a view pg_stat_progress_copy, reporting progress of COPY commands. This allows rough estimates how far a running COPY progressed, with the caveat that the total number of bytes may not be available in some cases (e.g. when the … Continue reading "Waiting for PostgreSQL 14 – Report progress of COPY commands"
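A minimal sketch of reading the view from another session while a COPY is running (columns as found in the released PostgreSQL 14; the initial patch may have fewer):

SELECT pid,
       relid::regclass AS relation,
       command,                 -- COPY FROM or COPY TO
       type,                    -- FILE, PROGRAM, PIPE or CALLBACK
       bytes_processed,
       bytes_total,             -- may be 0 when the total size is unknown
       tuples_processed,
       tuples_excluded
FROM pg_stat_progress_copy;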

Data types are an important topic in any relational database. PostgreSQL offers many different types, but not all of them are created equal. Depending on what you are trying to achieve, different column types might be necessary. This post will focus on three important ones: the integer, float and numeric types. We have recently seen a couple of support cases related to these topics, and I thought it would be worth sharing this information with the public, to help my readers avoid some common pitfalls seen in client applications.

Creating sample data

To get started, I’ve created a simple table containing 10 million rows. The data types are used as follows:


test=# CREATE TABLE t_demo (a int, b float, c numeric);
CREATE TABLE
test=# INSERT INTO t_demo
SELECT random()*1000000, random()*1000000, random()*1000000
FROM generate_series(1, 10000000) AS id;
INSERT 0 10000000
test=# VACUUM ANALYZE;
VACUUM
test=# \timing
Timing is on.

After the import, optimizer statistics and hint bits have been set to ensure a fair comparison.

Float vs. numeric

While the purpose of the integer data type is clear, there is an important difference between the numeric type and the float4 / float8 types. Internally, float uses the FPU (floating point unit) of the CPU. This has a couple of implications: float follows the IEEE 754 standard, which also implies that the rounding rules defined by the standard are followed. While this is totally fine for many data sets (measurement data, etc.), it is not suitable for handling money.
In the case of money, different rounding rules are needed, which is why numeric is the data type you have to use to handle financial data.

Here’s an example:


test=# SELECT a,
b,
c,
a + b,
a + b = c
FROM (SELECT 0.1::float8 a,
0.2::float8 b,
0.3::float8 c
) AS t;
a    | b   | c   | ?column?            | ?column?
-----+-----+-----+---------------------+----------
0.1  | 0.2 | 0.3 | 0.30000000000000004 | f
(1 row)

As you can see, a floating point number always uses ap

[...]

A table in PostgreSQL has a relfilenode value, which specifies the name of the table's file on disk (except for foreign tables and partitioned tables). In general, this value can be found in the relfilenode field of the pg_class table, but there are some specific tables for which the relfilenode field is 0. This blog will explore how the kernel handles the relfilenode of these special tables.
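As a minimal sketch of what "relfilenode is 0" looks like in practice, you can list a few such relations and let pg_relation_filenode() resolve the real on-disk filenode for them:

-- Mapped relations show relfilenode = 0 in pg_class;
-- pg_relation_filenode() looks up the actual filenode for them.
SELECT relname,
       relfilenode,
       pg_relation_filenode(oid::regclass) AS on_disk_filenode
FROM pg_class
WHERE relfilenode = 0
  AND relkind = 'r'
LIMIT 5;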

Relfilenode of ordinary table

When a normal table is created in PostgreSQL, the relfilenode value of the table is stored in the pg_class system table. As can be seen in this example, when a table is created, its OID and relfilenode value are both 16808. You can also find a file on disk with the same name as the relfilenode value; in fact, the data inserted into the table is stored in that file.

postgres=# create table t2(i int);
CREATE TABLE
postgres=# select oid,relname,relfilenode from pg_class where relname = 't2';
  oid  | relname | relfilenode 
-------+---------+-------------
 16808 | t2      |       16808
(1 row)

postgres=# \q
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16808 
-rw-------+ 1 movead movead 0 Dec 31 17:11 ../data/base/12835/16808
movead@movead-PC:/h2/pgpgpg/bin$

After we perform operations such as truncate, vacuum full, etc. on a table, the data in this table will be rewritten and the value of relfilenode for this table will be changed. The following test shows that after truncate, the relfilenode of the t2 table has changed from 16808 to 16811.

postgres=# truncate t2;
TRUNCATE TABLE
postgres=# select oid,relname,relfilenode from pg_class where relname = 't2';
  oid  | relname | relfilenode 
-------+---------+-------------
 16808 | t2      |       16811
(1 row)

postgres=# checkpoint;
CHECKPOINT
postgres=# \q
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16808
ls: cannot access '../data/base/12835/16808': No such file or directory
movead@movead-PC:/h2/pgpgpg/bin$ ll ../data/base/12835/16811
-rw-------+ 1 movead movead 0 Dec 31 17:16 ../data/base/12835/16811
movead@movead-
[...]
Posted by Andreas 'ads' Scherbaum on 2021-01-11 at 14:00
PostgreSQL Person of the Week Interview with Gunnar ‘Nick’ Bluth: I’m Gunnar (aka. “Nick”) Bluth, born in ‘73, married with two grown-up kids. I grew up in Münster/Westfalen in Germany, where I repatriated to in 2016 after living near Frankfurt/M. for 13 years. So I’ll probably also rot here ;-)
Posted by Haki Benita on 2021-01-10 at 22:00

If you work with databases you are probably familiar with B-Tree indexes. They are used to enforce unique and primary-key constraints, and are the default index type in most database engines. If you work with text, geography or other complex types in PostgreSQL, you may have dipped your toes into GIN or GIST indexes. If you manage a read intensive database (or just follow this blog), you may also be familiar with BRIN indexes.

There is another type of index you are probably not using, and may have never even heard of. It is wildly unpopular, and until a few PostgreSQL versions ago it was highly discouraged and borderline unusable, but under some circumstances it can out-perform even a B-Tree index.

In this article we re-introduce the Hash index in PostgreSQL!

This article was co-authored by Michael from pgMustard

Photo by Jan Antonin Kolar

Hash Index

Just like the name suggests, Hash indexes in PostgreSQL use a form of the hash table data structure. A hash table is a common data structure in many programming languages; for example, a Dict in Python, a HashMap in Java or the new Map type in JavaScript.
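Before diving in, a minimal sketch (hypothetical table) of what creating one looks like – the only difference from a regular CREATE INDEX is the USING hash clause:

CREATE TABLE urls (
    id  bigserial PRIMARY KEY,
    url text NOT NULL
);

CREATE INDEX urls_url_hash_idx ON urls USING hash (url);

-- Hash indexes only support equality lookups, e.g.:
-- SELECT * FROM urls WHERE url = 'https://example.com/';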

To understand how you can benefit from Hash indexes, it's best to understand how they work.

Hash Function

Hash indexes use a hash function.

[...]

How to search directly into the PostgreSQL documentation from your Plasma desktop.

If you, like me, are addicted to Plasma, the KDE desktop, you probably already know about krunner, an application launcher on steroids.
krunner allows you to quickly launch, kill, switch to and manage applications, as well as execute computations and, most notably, web searches. In fact, krunner exploits the Konqueror shortcuts for web searches. Konqueror is the default web browser for KDE/Plasma (since KDE version 2), and allows for quick customization of shortcuts that let you redirect a string through a search engine. As an example, by default Konqueror has the dd and gg shortcuts: the former searches the remaining part of the string through DuckDuckGo, while the latter uses Google.
So, what does it take to get krunner integrated with the PostgreSQL official documentation search engine?
There is not much work to do, after all; in fact it suffices to: 1) create a new Konqueror shortcut; 2) no, there are no other steps involved!
The good news is that you can configure whatever you want by the krunner interface itself.

Configure krunner

First of all, launch krunner by hitting ALT + F2 or ALT + , then click on the setup icon on the left of the bar





In the dialog window, scroll to the Web Shortcuts line and click on the configure icon.





In the opened dialog, after having searched for the key sequence you want to insert, click on the New button to create a new shortcut.





Fill in the dialog as you find appropriate, but for the Shortcut URL enter https://www.postgresql.org/search/?q= and then hit the button on the right to insert the query parameter (\{@}), so that the end result is https://www.postgresql.org/search/?q=\{@}.
Place a shortcut in the Shortcuts entry, separated by commas, for example pg, then postgres and last postgresql, so that you will b

[...]

How to search directly within the PostgreSQL documentation from your Firefox web browser.

The Firefox web browser supports several search engines: extensions by means of which you can enter a search string and have it passed to a specific site for searching.
It is possible to customize Firefox to search for a particular string within the official PostgreSQL documentation: the idea is to instrument the web browser to redirect the search string to the PostgreSQL web site via a GET URL.
In order to achieve this, you need to install a customizable search engine, and then configure the shortcuts that enable access to it.

Custom Search Engine Setup

The first step consists of installing the Custom Search Engine to your Firefox web browser.
Then, clicking on the main Firefox menu (the hamburger icon), select the Add-ons entry and then go to the extensions menu: you should see the new search engine there. Check that the engine is active, then click on the three-dots button and select Preferences:





In the opened screen, edit a line to add the following details:

  • key, I use pg as the default prefix to indicate I’m going to specify a PostgreSQL documentation search;
  • Search Engine Name, set to PostgreSQL or any name it makes sense to you;
  • URL, you have to set it to https://www.postgresql.org/search/?q={searchTerms}, where {searchTerms} is going to be replaced by Firefox with the search keywords;
  • Description, whatever makes sense to you, for example PostgreSQL Official Documentation.


As you can imagine, the important parts are the key and the URL. Note that you can also target specific PostgreSQL versions by changing the URL to include a version number; do a few searches on the official web site and inspect the URL for other arguments.
Once you are done, click on the Save Preferences button and then close the tab.





Searching into the documentation

[...]
A bit ago I wrote a blog post that was supposed to show how to keep the number of rows in a table to N per some category. Unfortunately, I overlooked a problem related to concurrency. As Mikhail wrote, we will get problems if we do operations from multiple connections, at once-ish. Let's see it: CREATE TABLE … Continue reading "How to limit rows to at most N per category – fix"

Crunchy Data is pleased to announce the publication of the  Crunchy Data PostgreSQL Security Technical Implementation Guide (STIG) by the United States Defense Information Systems Agency (DISA). Crunchy Data collaborated with DISA to make PostgreSQL the first open source database to provide a published STIG in 2017, and this new STIG reflects Crunchy Data's ongoing collaboration with DISA to provide enhanced security guidance as PostgreSQL continues to advance and evolve.

I added RHEL 8 - ppc64le support to PostgreSQL YUM repo for all the supported PostgreSQL versions as of today.

To use the repo and install PostgreSQL on this platform, please follow these steps: Continue reading "Announcing PostgreSQL RPM repository for RHEL 8 - ppc64le"
In case you're not familiar with this site, why-upgrade.depesz.com shows you an aggregated changelog between any two releases, with optional searching for some keywords. Yesterday azeem on IRC pointed me towards a problem on why-upgrade.depesz.com. Specifically, when displaying changes from 9.5.24 to 13.1, the site showed 30 security fixes. But in reality there should be only 10. … Continue reading "Fixes on why-upgrade.depesz.com"

Greetings friends! We have finally come to the point in the Postgres for Data Science series where we are not doing data preparation. Today we are going to do modeling and prediction of fire occurrence given weather parameters… IN OUR DATABASE!

PostgreSQL Database Security

When we talk about database security, it encompasses different modules in different areas. It is a very vast topic, because with databases we need to secure the whole ecosystem, not just the database node. The figure below shows the major breakdown of the “parts” which need to be secured. It is clearly evident that the database itself is just one sixth of that. You need to secure your (1) Network, (2) Network Node, (3) Data, (4) Database, (5) Users and (6) Application. A database is only secure if all of these parts are secure.

 

PostgreSQL Security

The blog series will cover all the topics related to PostgreSQL Database Security.

PostgreSQL Database Security

Database security is normally divided into Authentication, Authorization, and Accounting (AAA). The database system needs to, first, authenticate a user; secondly, authorize what the user can do with the database; and thirdly, account for what the user did with the database. PostgreSQL is considered to be one of the most secure databases, providing AAA capabilities.


Authentication

Authentication determines which user is allowed to access the database – in technical terms, which user can “log in” to the database system. PostgreSQL has a strong concept of what the authentication process should be. PostgreSQL authentication methods fall into three categories: (1) PostgreSQL internal authentication, (2) operating system authentication and (3) external server authentication. Figure 3 shows all the supported authentication methods sorted by category.

 

PostgreSQL Authentication

Authorization

The second part of AAA is authorization: after login, what activities a user can perform in the database. PostgreSQL has different controls to manage users – what a user can and cannot do. This control can be object-based or row-based. Here are the main categories of PostgreSQL authorization features (a brief sketch follows the list):

  • Roles, Users, and Groups
  • GRANT/ REVOKE
  • Row Level Security
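As a minimal sketch (hypothetical table and role names) of the first two categories plus a row level security policy:

-- A role, object-level grants, and a row level security policy
CREATE ROLE report_reader LOGIN PASSWORD 'secret';

GRANT SELECT ON sales TO report_reader;        -- hypothetical table
REVOKE ALL ON customers FROM report_reader;    -- hypothetical table

ALTER TABLE sales ENABLE ROW LEVEL SECURITY;
CREATE POLICY sales_own_region ON sales
    FOR SELECT TO report_reader
    USING (region = current_setting('app.region'));   -- illustrative predicate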

Accounting

The third and last part of AAA is accounting. After “login” to th

[...]