Posted by Holly Orr in OpenSCG on 2017-06-22 at 18:15

We are pleased to announce pgAdmin3 LTS is now available on OSX and Windows for PostgreSQL 10!

When v1.0 of pgAdmin4 was released in September 2016, the pgAdmin Development Team decided to stop providing fixes for pgAdmin III. While we expect pgAdmin3 usage to fade out over time, we have forked the code to include basic support for PostgreSQL 9.6+.

Try it out!

Install pgAdmin3 LTS by BigSQL

If you haven’t already installed PostgreSQL, start here.

Via Command Line

Run the following commands (Windows users should omit the ./ prefix from the pgc commands):

cd <directory of installation>
./pgc install pgadmin3
./pgc start pgadmin3

Via GUI

If you haven’t already, install pgDevOps.

In pgDevOps, navigate to Package Manager, select the pgAdmin3 component, and install.

Already have pgAdmin3 installed?

cd <directory of installation>
./pgc update
./pgc upgrade pgadmin3

Wanna delete pgAdmin3?

cd <directory of installation>
./pgc remove pgadmin3

As always, give us your feedback so that we can continue to support and respond to the needs of the PostgreSQL community.

I assume that you're familiar with the explain.depesz.com and paste.depesz.com services. From today they are also available as services on TOR, at explain4pg4j5wbw.onion and paste4sql64vzyry.onion respectively (of course you need the TOR Browser or some other way to browse the darknet). Why did I do that? Well, I believe that TOR is one of the greatest inventions […]
Posted by Bruce Momjian in EnterpriseDB on 2017-06-21 at 16:15

You might be aware of the interval data type, which allows mathematical operations on date, time, and timestamp values. This can lead to odd behavior, but this email posting showed me a new oddity when dealing with months containing a different number of days. First, let's summarize how many days are in the first five months of 2017:

2017-01     31
2017-02     28
2017-03     31
2017-04     30
2017-05     31
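
To see the kind of oddity involved (an illustration of mine, not taken from the post), notice that adding and then subtracting a month does not always round-trip:

SELECT '2017-01-31'::date + '1 month'::interval;
-- 2017-02-28 00:00:00  (clamped to the last day of February)

SELECT ('2017-01-31'::date + '1 month'::interval) - '1 month'::interval;
-- 2017-01-28 00:00:00  (the original day of the month is lost)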

Continue Reading »

The third PGConf US event of the year is happening July 13 - 14th! Tickets are now available!
This year the event is at Huntsman Hall located at The Wharton School. There will be two days of awesome Postgres content. On July 13th we are pleased to offer two great training sessions by two of the most veteran members of the PostgreSQL community, Bruce Momjian and Joshua Drake. Between them they have well over 40 years of hands-on experience with PostgreSQL. 

On July 14th we have eight first-rate sessions covering everything from Postgres performance to development. Check out the full schedule here:
Sponsorship opportunities for the conference are still available. If you use Postgres, support Postgres, or if you are looking to recruit a local Postgres expert, this is a great opportunity. Become a part of our dynamic and growing ecosystem! The prospectus is available at:
We would not be able to produce PGConf Local: Philly 2017 without the generous support from our sponsors:

For more information about PGConf US please visit: PGConf US


PGConf US: People, Postgres, Data



The GDAL community just announced the release of GDAL 2.2… and BigSQL just made it easier to implement!

Install it Now

We support Diegos and Yodas with 2 ways to install:

Option 1: Install with the pgDevOps GUI

For GIS analysts (scientists, statisticians, analysts, etc) who find themselves as the de facto DBA (among other things).

“I don’t have time for this! I’m saving this sea turtle. Just give me a GUI!”

1. Go to www.bigsql.org/postgresql/installers and download and run the installer for your operating system.

2. Make sure to check the box for including the pgDevOps component with your install.

3. Open pgDevOps in your web browser:

    <ip of machine>:8051

If installing on your local machine:

    localhost:8051  

4. In the pgDevOps dashboard click the Package Manager icon.

5. Click the PostgreSQL installation you want to install on.
6. Click on the PostGIS icon and follow the instructions to install.

That’s it!

Option 2: Install via Command Line and PGC

For DBAs who find themselves supporting these strange spatially enabled databases.

“Real men only use command line! And what the hell is GIS?”

  1. Install pgc via the command line into a sandbox.
    *Note: this command installs the BigSQL distribution into your current directory.

    MAC / Linux:

    python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)" 
    
    

    Windows:

    @powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"
    
    
  2. Navigate via the command line to the bigsql directory, the product home directory. (Windows users should omit the ./ prefix from the pgc commands shown below.)

    cd bigsql
    
    
  3. Using the pgc command line tool, run the update command to get the latest distribution:

    ./pgc update
    
    
  4. Then run the install, init, and start commands on pg96:

    ./pgc install pg96
    ./pgc init pg96
    ./pgc start pg96
    
    
  5. Next, install postgis:

    ./pgc install postgis23-pg96
    
    

That’s it!
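
Once everything is installed, a quick check from psql (my addition, not part of the original walkthrough) confirms that the extension is in place and, when raster support is built in, which GDAL release it was built against:

    CREATE EXTENSION IF NOT EXISTS postgis;
    SELECT PostGIS_Full_Version();  -- the GDAL version appears in the output when raster support is enabled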

The GDAL_DATA Directory Path

In order for all your Po

[...]
Posted by Bruce Momjian in EnterpriseDB on 2017-06-19 at 14:45

If you often use the TIMESTAMP data type, you might not be making full use of it. In these queries:

CREATE TABLE tztest (x TIMESTAMP);
 
INSERT INTO tztest VALUES (CURRENT_TIMESTAMP);
 
SELECT * FROM tztest;
             x
----------------------------
 2016-10-25 18:49:20.220891
 
SHOW timezone;
  TimeZone
------------
 US/Eastern
 
SET timezone = 'Asia/Tokyo';
 
SELECT * FROM tztest;
             x
----------------------------
 2016-10-25 18:49:20.220891
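
For contrast (my own addition, not part of the excerpt), a TIMESTAMP WITH TIME ZONE column stores a point in time and converts it to the session's time zone on output:

CREATE TABLE tztest2 (x TIMESTAMPTZ);

INSERT INTO tztest2 VALUES ('2016-10-25 18:49:20.220891-04');

SET timezone = 'Asia/Tokyo';

SELECT * FROM tztest2;
               x
-------------------------------
 2016-10-26 07:49:20.220891+09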

Continue Reading »

Posted by Dimitri Fontaine on 2017-06-19 at 11:30

Business logic is supposed to be the part of the application where you deal with customer or user facing decisions and computations. It is often argued that this part should be well separated from the rest of the technical infrastructure of your code. Of course, SQL and relational database design is meant to support your business cases (or user stories), so then we can ask ourselves if SQL should be part of your business logic implementation. Or actually, how much of your business logic should be SQL?

Posted by Federico Campoli on 2017-06-19 at 09:00
After another round of bug fixes I've released the version 1.3 of my pet project pg_chameleon.

The package is available on pypi as usual.

The changelog is available here.

If you have any questions or issues to discuss, I have created a community on gitter.

Please join!

I've also added a RELEASE_NOTES file to explain the changes. As this release changes the relay storage, please have a look at the file if you plan to upgrade.

The initial implementation for the relay data was to have two log tables t_log_replica_1 and t_log_replica_2 with the replica process accessing one table at a time.

This approach allows autovacuum to take care of the unused partition while the other is being written.

The method worked fine with only one replica worker. However, as the flip-flop between the tables is calculated independently for each source, this could cause unwanted bloat on the log tables if several sources are replicating together.

In this scenario autovacuum struggles to truncate the empty space at the end of the table.

pg_chameleon version 1.3 implements log tables per source. Each source has a dedicated pair of tables, still inherited from the root partition t_log_replica.


The schema upgrade happens automatically at the first run.
Because it involves a data copy, it could take more time than usual. If the process seems frozen, do not stop it; otherwise you may lose your replica setup.

Those are the upgrade steps you should follow.
  • Stop all the replica sources. The show_status command must show all the rows in stopped status
  • Take a backup of the schema sch_chameleon with pg_dump
  • Upgrade pg_chameleon with pip install pg_chameleon --upgrade
  • Run chameleon.py upgrade_schema --config <your_config> --debug
  • When the upgrade is finished start the replica process as usual
If something goes wrong in the upgrade process, restore the backup of sch_chameleon, downgrade the installation to pg_chameleon 1.2, and start the replica as usual.

PostgreSQL has a feature called dollar-quoting, which allows you to include a body of text without escaping the single quotes. This feature has existed for quite some time. You've probably seen this in action when defining functions for example:

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
$$
SELECT 'Hello world. My name is ' || param_your_name || '.';
$$
language sql STRICT;

This is easier to read than the equivalent escape-quoted function:


CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
'
SELECT ''Hello world. My name is '' || param_your_name || ''.'';
'
language sql STRICT;
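
The delimiter does not have to be the bare $$: you can put a tag between the dollar signs, which helps when the body itself contains $$ (a small illustration of my own, not from the post):

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
$body$
SELECT 'Hello world. My name is ' || param_your_name || '.';
$body$
language sql STRICT;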

Continue reading "Dollar-quoting for escaping single quotes"
Posted by Federico Campoli on 2017-06-17 at 08:57
On the 30th of May I had the honor of speaking at the second Estonia PostgreSQL User Group meeting.

My talk was about my pet project pg_chameleon and it went very well.

The audience was about 40 people, highly interested in the topic, and there were questions from all around.

I'm impressed by the Estonian PostgreSQL User Group.
Their organizational skills are amazing, the members' interest is very high, and they are fantastic friends.

If you are curious about the talk, the recording is available on EEPUG's YouTube channel.



Posted by Bruce Momjian in EnterpriseDB on 2017-06-16 at 15:30

PL/Java has been around since 2005, but it has regularly struggled to gain users. Unfortunately, a lot of these problems are specific to the Java language and hamper its adoption.

First, there are a limited number of people who know both Java and the Postgres backend code. Fortunately Chapman Flack has recently resumed PL/Java development.

Second, there is the deployment complexity of binding PL/Java to a Java Runtime Environment (JRE) in an easily-deployed way. This 2014 email thread discusses the problems of JRE packaging and installation.

Continue Reading »

Posted by Shaun M. Thomas in 2ndQuadrant on 2017-06-16 at 15:00

Testing within the context of a database can sometimes seem a bit overwhelming. Not only is it necessary to create a reasonable set of test data to bootstrap multiple tables, but we need some reliable invocation system to compare query, view, and function output to expected results. And that is no small feat. For a very long time, the "easiest" way to test end-user database objects involved building a Makefile that included PGXS. Then we had to create the tests, run the tests, and copy over the output into static expected files. Modifying any test always meant updating the static output file as well.

It’s great that the tools are there, yet why do we need to be halfway to a Postgres developer in order to follow good QA practices with Postgres? Well, maybe we don’t. It turns out, like many things in the Postgres world, that there’s an extension for that. These days, all the cool kids are using pgTAP. Not only does pgTAP make it easier to write tests for Postgres, but its design generally encourages test-driven development. Regardless of how any of us might view that development paradigm, having it available makes Postgres a more welcoming platform for those who fully embrace it.

To that end, let’s say we have our trusty sensor_log table and we want to write a get/set function pair to control and access its contents. As always, the sensor_log table looks like this:

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

We also need to install pgTAP itself. That part, at least, is as easy as the usual extension install. For RHEL and Debian system users, it’s not even necessary to build anything, since pgTAP is in the official Postgres YUM and APT repositories. After that, it’s just a quick extension install:

CREATE EXTENSION pgtap;
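
To give a sense of what such tests look like (a minimal sketch of my own, assuming the sensor_log table above, not the post's example), a pgTAP test file is plain SQL, typically run inside a transaction:

BEGIN;

SELECT plan(2);

SELECT has_table('sensor_log');
SELECT has_column('sensor_log', 'reading_date');

SELECT * FROM finish();
ROLLBACK;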

In the spirit of TDD, we should write the

[...]
At OmniTI we love Solaris; my personal favourite features are ZFS and DTrace. Unfortunately, not many people run Postgres on Solaris, so I have decided to implement similar features on Linux. Instead of DTrace I'll install BPF, in-kernel bytecode that can be used for tracing, introduced in recent kernels (4.x).
This post is the first of a three-part series. In this post we'll start with the setup; in part #2 we'll cover ZFS and how to use it for backups / snapshots; in part #3 we'll dig into BPF a bit more.

Step 1 is to set up a new Ubuntu system. I set up a VM using ubuntu-16.04.2-server-amd64.iso.

As root :
Add the repo for bcc :
> echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" | sudo tee /etc/apt/sources.list.d/iovisor.list

sudo apt-get update

Install all necessary and some optional packages :

> apt-get install -y sudo wget apt-transport-https joe less build-essential libreadline-dev \
zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev openssh-server screen git \
binutils bcc bcc-tools libbcc-examples python-bcc zfsutils-linux \
systemtap systemtap-sdt-dev

Make sure the kernel is compiled with BPF support:

> cat /boot/config-`uname -r` |grep BPF
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_HAVE_BPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m

Test BCC (which stands for BPF Compiler Collection):

> python /usr/share/bcc/examples/tracing/bitehist.py
Tracing... Hit Ctrl-C to end.
^C
     kbytes              : count     distribution
         0 -> 1          : 7        |************                            |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 22       |****************************************|
         8 -> 15         : 19       |**********************************      |
        16 -> 31         : 8        |**************                          |
        32 -> 63         : 6        |**********                              |
        64 -> 127        : 1        |*[...]
Posted by Markus Winand on 2017-06-15 at 00:00

In December 2016, ISO released a new version of the international SQL standard (ISO/IEC 9075:2016). It supersedes the previous version from 2011.

This article is a brief overview of the new features being introduced into the SQL language. Strictly speaking, this article covers the additions to part 2 of the standard (SQL/Foundation), i.e. the most commonly used part.

This article also shows the availability of these features among six major databases. Note that respective figures—shown below—only reflect whether or not the databases support the features in the way described by the standard. For example, an X in the JSON row does not mean the database has no JSON support—it simply means the database doesn't support the JSON features described by the SQL standard. As a matter of fact, all tested databases support JSON in some way—but not necessarily in the way described by the standard.

Overview of Availability of new SQL:2016 features

Before going through all the shiny new features, let's look at some trivia: part 2 of the SQL:2016 standard has 1732 pages—that's 260 pages (~18%) more than the 2011 edition. It introduces 44 new optional features (+14%). Let's take a look at them…

Row Pattern Recognition

Row pattern recognition captures groups of rows that follow a pattern. Rows in matched groups can be filtered, grouped, and aggregated. The pattern itself is described with a simple regular expression syntax.

The main use of row pattern recognition is to check time series for patterns. However, the match_recognize clause combines aspects of the where, group by, having and over clauses (window functions), so it is also useful in many other cases.
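
As a taste of the syntax (my own sketch against a hypothetical trades table, not an example from the article), the classic "price falls, then rises" pattern looks like this:

SELECT *
  FROM trades
 MATCH_RECOGNIZE (
   PARTITION BY symbol
   ORDER BY trade_date
   MEASURES FIRST(trade_date) AS decline_start,
            LAST(trade_date)  AS recovery_end
   ONE ROW PER MATCH
   PATTERN (STRT DOWN+ UP+)
   DEFINE DOWN AS price < PREV(price),
          UP   AS price > PREV(price)
 );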

I have given a presentation about row pattern recognition. It discusses several examples in two implementation variants: with and without the new match_recognize clause. The examples cover some typical use cases, and also some atypical use cases for row pattern recognition:

  • Consecutive events: identifying sessions in a web-log; tolerating gaps in sequences

  • Top-N per group (might be faster than window functions!)

  • Time in

[...]
Posted by Bruce Momjian in EnterpriseDB on 2017-06-14 at 14:30

The use of server-side logic, particularly stored procedures, has been a highly contentious topic among database professionals for decades. The question has always been what amount of logic should be encoded in the database vs. in client applications or application servers.

Beyond using the database as a simple data store, there are three levels in which logic can be added to Postgres:

  1. Database constraints, e.g. unique, check, foreign keys
  2. Triggers
  3. Manually-called stored procedures, i.e., functions not called as triggers

Let's look at each of these in turn:

  1. Database constraints are often seen as optional by database application developers. They don't really do anything except prevent invalid data from being entered into the database. The overhead of performing these checks is often seen as negative, particularly foreign key checks. (DEFAULT clauses that call stored procedures are similar to triggers; the serial data type uses this.)
  2. Triggers are stored procedures assigned to tables that are executed automatically during INSERT, UPDATE, DELETE, or TRUNCATE commands. They can be triggered before the command runs, usually for complex constraint checks or to modify incoming data, e.g. capitalization. When triggered after the command, they perform post-command operations, such as adding entries to a log table (a minimal sketch follows this list).
  3. Manually-called stored procedures are functions usually called in WHERE clauses or in the target list of SELECT queries. Stored procedures, including ones used as triggers, can be written in many languages.
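
As an illustration of level 2 (a sketch I'm adding with hypothetical table names, not an example from the post), an after-delete trigger that archives removed rows might look like this:

CREATE TABLE customer (id integer PRIMARY KEY, name text);
CREATE TABLE customer_audit (LIKE customer);

-- Trigger function: copy the deleted row into the audit table
CREATE FUNCTION log_customer_delete() RETURNS trigger AS $$
BEGIN
    INSERT INTO customer_audit SELECT OLD.*;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER customer_delete_log
    AFTER DELETE ON customer
    FOR EACH ROW EXECUTE PROCEDURE log_customer_delete();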

Continue Reading »

Posted by Douglas Hunley in OpenSCG on 2017-06-14 at 12:40

If you’ve followed my previous posts (here and here), then you already have one or more versions of PostgreSQL installed on your Mac. Maybe these are solely for test or dev purposes and you don’t really care about any of the data therein, but if you do, let me guide you to pgBackRest.

pgBackRest aims to be a simple, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.

Instead of relying on traditional backup tools like tar and rsync, pgBackRest implements all backup features internally and uses a custom protocol for communicating with remote systems. Removing reliance on tar and rsync allows for better solutions to database-specific backup challenges. The custom remote protocol allows for more flexibility and limits the types of connections that are required to perform a backup which increases security.

pgBackRest is written in Perl, but don’t hold that against it. As of the 1.19 release, pgBackRest can now use S3 buckets as the storage backend. I really like pgBackRest and tend to use it for myself and customers over any of the other tools in the PostgreSQL ecosphere. So, let’s get started by downloading the latest release from their site, and then installing it. For some reason, no one has added pgBackRest to Homebrew yet (someone, pls!) so let’s do it the manual way.


(Keep in mind that I already had Perl set up to connect to PostgreSQL for other uses. You might need to install DBD::Pg.)

Now that pgBackRest is installed, let’s configure it. First, we’ll want to set some of the global properties that affect all pgBackRest operations.


In the global configuration, we set the following:

  • force the log level for all console output to ‘info’
  • define the S3 bucket we want to use
  • define the S3 endpoint to connect to
  • define our S3 key
  • define our S3 secret key
  • set which region our bucket is in
  • tell pgBackRest that we’re using S3 as the backend
  • configure retention of full backups
  • tell pgBackRest to issue a CHECKPOINT so backups can start right away instead of waiting for
[...]
PostgreSQL 10 is full with a lot of big, new and exciting features. Declarative partitioning is one of those. It is something users have wanted for years. During PGCon 2017, it was a hot topic of discussion. People wanted to know more about the feature, and were eager to try it out. The un-conference and conference session on partitioning attracted a large crowd. One of the frequently asked questions centred on whether a user can change partition bounds of an existing partition. This bears exploring, and an answer.

Implicit to the question is the use of the ALTER TABLE command. For those who are new to PostgreSQL or to PostgreSQL partitioning, partitions are tables in PostgreSQL. Database administrators use the CREATE TABLE command to create partitions using the PARTITION OF clause. Essentially, users then expect an ALTER TABLE subcommand that allows a change to the partition bounds e.g. ALTER TABLE ... ALTER FOR VALUES ... or a similar kind of command. But, there's no such ALTER TABLE subcommand in PostgreSQL v10. We may add it in the future versions, but we have not seen any such proposal yet.

DBAs planning to use the partitioning features to be introduced in PostgreSQL v10 should not do so lightly. First, bad partitioning is worse than no partitioning. It's critical to choose partitioning keys, strategy, and ranges/lists after significant thought and testing. Second, there are many functionalities still missing from v10 partitioning, such as SPLIT and MERGE. Like many other major features in PostgreSQL, partitioning will take a few (or possibly just a couple of) releases to be functionally complete or close to it. However, the functional deficiency described above is not hard to overcome, and what follows is how to do it.

Let's create a partitioned table with three partitions:

CREATE TABLE t1 (a int, b int) PARTITION BY RANGE(a);
CREATE TABLE t1p1 PARTITION OF t1 FOR VALUES FROM (0) TO (100);
CREATE TABLE t1p2 PARTITION OF t1 FOR VALUES FROM (200) TO (300);
CREATE TABLE t1p3 PARTITION OF t1 FOR VALUES FROM (300) TO (400);

Let'
[...]
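
The workaround hinted at above can be sketched with the DETACH/ATTACH subcommands that v10 does provide (my illustration reusing the table above, not the post's own continuation): detach a partition, then attach it again with new bounds.

ALTER TABLE t1 DETACH PARTITION t1p1;
-- Re-attach with wider bounds; existing rows must satisfy them, and the ATTACH scans the table to verify.
ALTER TABLE t1 ATTACH PARTITION t1p1 FOR VALUES FROM (0) TO (200);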

Release 5 of the PostgreSQL Buildfarm Client has been released and can be downloaded from https://buildfarm.postgresql.org/downloads/releases/build-farm-5.tgz

In a move similar to PostgreSQL's new version numbering, this release switches to a one-part numbering system.

In addition to a number of bug fixes and very minor changes, this release has the following features / changes:

  • Cross-version pg_upgrade module is no longer experimental – see below
  • TAP tests now run the “check” target, but in most cases redundant installs are avoided
  • Improved and expanded TAP test running on MSVC animals – these now run the same tests as other animals
  • Automatic garbage collection of git repositories, once a week by default. This should improve the speed of git operations, especially on older git releases.
  • Improve logging in SEpgsql module
  • Revert 4.19 setting of TZ by default. If you want it set then do so in the config file. Doing so can speed up initdb. Setting it by default caused too many anomalies.
  • The environment setting BF_LOG_TIME (settable in the config file) causes command output to be prefixed with a timestamp on non-MSVC platforms if /usr/bin/ts is present.

 

The cross-version upgrade module (TestUpgradeXVersion) has long been an experimental module. It has been refactored and set to report its results properly to the server, so it’s now ready for wider use. Its use requires quite a lot of disk space, so it’s not configured by default. You need to have at least 5Gb of available space in order to use it. Normally in a buildfarm run all the build products are removed. This module saves them in a special area so that they are available for testing against different branches. Each branch is tested against itself and any earlier branches that the module has saved. So currently for the HEAD (i.e. master) branch it can test upgrades from 9.2, 9.3, 9.4, 9.5 and 9.6 as well as itself. The tests cover all the databases in the install, such as each contrib database, not just the standard regression database.

The module has a heur

[...]
Posted by Dimitri Fontaine on 2017-06-13 at 11:47

Sometimes you need to dive into an existing data set that you know very little about. Let’s say we’ve been lucky to have had a high level description of the business case covered by a database, and then access to it. Our next step is figuring out data organisation, content and quality. Our tool box: the world’s most advanced open source database, PostgreSQL, and its Structured Query Language, SQL.

Some years ago, I prophesied that PostgreSQL would win in the end, and that we would be able to tell because there would be “Microsoft PostgreSQL”.

I am happy to report that has now happened. Microsoft PostgreSQL database service on Azure was announced recently.

Of course, that was just a metaphor for “a.n.other big company”. I’m not singling out MS for special attention here; in fact, they have been the last to do this.

That means PostgreSQL database services are now available from all of the main cloud service providers: Amazon, Google, IBM and Microsoft. Yay! We have achieved 100% breakthrough in terms of cloud adoption – everyone recognizes now that PostgreSQL is big and adoption is accelerating.

PostgreSQL is now used in UK and US governments, see UK (http://www.computing.co.uk/ctg/news/2347067/met-office-selects-2ndquadrant-to-help-with-data-migration-to-open-source) or US Dept of Defense (http://iasecontent.disa.mil/stigs/pdf/U_PostgreSQL_9-x_STIG_V1_Release_Memo.pdf), so it looks like the trend is the same in both industry and government, and across all parts of the world.

Hmmm, the only issue at present is support of the community. None of the biggest cloud players contributes features to open source, nor do they report bugs, and they definitely don’t fix them. The PostgreSQL community releases four maintenance releases per year. So if you want support for open source PostgreSQL with an SLA of less than 90 days, you’ll still need to come to one of the service providers listed here: https://www.postgresql.org/support/professional_support/.

Now some will call that blatant and mercenary advertising. But it seems to me to be necessary to point out who’s providing support and who is merely giving you access to servers. If you buy a car and it comes with no warranty or support, that’s a fairly important thing for car users to know, so I figure it works the same way with databases.

The June meeting will be held at 18:00 EST on Tues, the 27th. Once again, we will be holding the meeting in the community space at CoverMyMeds. Please RSVP on MeetUp so we have an idea on the amount of food needed.

Topic

Phil Vacca, of Crunchy Data fame, joins us this month to talk about containerized PostgreSQL!

The talk excerpt:

Containerization has captured the interest of developers, engineers, and business people everywhere. While BSD Jails and cgroups are established technologies in the open source world, it is the Docker project that has grabbed recent attention. Terms like stateless applications and microservices are now almost as ubiquitous as cloud computing, or virtualization before it. It is natural to wonder what, if anything, Docker offers to PostgreSQL deployment and administration.

Crunchy Data Solutions has built one of the best recognized Open Source PostgreSQL on Docker suites (https://github.com/CrunchyData/crunchy-containers), containing everything you need to run a PostgreSQL database solution - complete with replication, backups, failover, and appropriate monitoring - on Docker, Kubernetes, or OpenShift.

No specific knowledge of Docker containers is assumed for this talk. We will briefly discuss where the technology comes from, and how to get up and running with Docker; before diving into the really gory bits like how a Dockerfile is written, and where all the data is actually stored. Before we’re through we’ll see how quickly new nodes can be brought online with containers and streaming replication, and we’ll see what we can learn about failover from other highly distributed systems. We’ll even talk through logging and monitoring strategies.

Lastly we’ll discuss what all of this work really does for us as technologists. We’ll get an overview of the many Platform as a Service offerings that leverage containers, and try to predict some things about where application and data center design are going. Hopefully, containerization inspires us to explore the ever growing feature set of our

[...]

Oh, how I love the title of this 2014 Slashdot request, "Which NoSQL Database For New Project?" The user already knows the type of clients (iPhones and Android phones) and the middleware (PHP/Symfony or Ruby/Rails) and then comes the zinger, "I would like to start with a NoSQL solution for scaling …." In addition to the question of whether Ruby on Rails is a scalable solution, the replies are illuminating, and even funny, e.g. "I'm working on a new independent project. It will soon become the new Facebook, and I'll be billionaire next quarter. The only problem is that I don't know which luxury yacht to buy with all this money."

OK, now on to the serious replies, which are many, seem to come from seasoned SQL and NoSQL veterans and all seem to fall under the heading of premature optimization and scaling:

* NoSQL solutions can be ridiculously fast and scale beautifully over billions of rows. Under a billion rows, though, and they're just different from normal databases in various arguably-broken ways. By the time you need a NoSQL database, you'll be successful enough to have a well-organized team to manage the transition to a different backend. For a new project, use a RDBMS, and enjoy the ample documentation and resources available.

Continue Reading »

Posted by Luca Ferrari in ITPUG on 2017-06-11 at 06:54
There is a grant request aiming at revamping blogs.perl.org.
I have to admit that blogs.perl.org is in bad shape, and in fact I have not used it for my personal content since the well-known login issues.
Well, the important part about the grant request, at least with regard to the PostgreSQL community, is that…surprise! The new platform will store content on a PostgreSQL backend:

[…]
will be written on top of Dancer2,
DBIx::Class, and DBI,
with a PostgreSQL database
imported from the existing
[…]


Great news for two of my favourite Open Source projects (Perl and PostgreSQL), and a great way to spread the word through our own content!
Posted by Vasilis Ventirozos in OmniTI on 2017-06-10 at 10:55
I have mentioned in previous posts that in my 4 years with OmniTI we've tackled a lot of migrations. Most of them follow the "typical" procedure; the methodology we use is more or less explained here. Last week we had a use case for a kind of "unusual" upgrade: a 9.2 compiled with
"--disable-integer-datetimes", meaning that all datetimes were represented as floating point internally, which was the default up to 8.3. This changed in (I think) 8.4, where datetimes were represented as int64, which offers more precision.
The requirement was to migrate the database to a new one that will use integer datetimes with the minimum possible downtime. Obviously a direct upgrade wouldn't work, and pg_dump / restore was not an option, so we decided to approach and test this scenario differently.

The general idea is the following :

Upgrade to a 9.6 that was compiled with "--disable-integer-datetimes", and then use something like pglogical or mimeo to replicate to another 9.6 that uses integer datetimes. For this, I used 2 containers and the pagila test database to make the simulation as realistic as I could. In this post I will describe the steps I followed.

Installed both 9.2 and 9.6 on the same box :

9.2.21 with the following options :
./configure --prefix=/home/postgres/pgsql92/ --disable-integer-datetimes
make -j 8 world
sudo make install-world

9.6.2 with the following options :
./configure --prefix=/home/postgres/pgsql96/ --disable-integer-datetimes
make -j 8 world
sudo make install-world

Initiated a new cluster, started 9.2, and loaded the pagila test db (schema and data).
From now on this will act as my production database.
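
A quick way to confirm how each build represents datetimes (my addition, not part of the original write-up) is the read-only integer_datetimes parameter:

SHOW integer_datetimes;
-- 'off' on a --disable-integer-datetimes build, 'on' otherwise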

Downloaded and installed pglogical 2.0.1 for 9.6:
wget -c http://packages.2ndquadrant.com/pglogical/tarballs/pglogical-2.0.1.tar.bz2
uncompress :
tar jxfv pglogical-2.0.1.tar.bz2
compile and install :
make USE_PGXS=1 clean all
sudo PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/postgres/pgsql96/bin make USE_PGXS=1 install

Note: I instal[...]

There’s a lot of mystery-cult information floating around out there about what to set shared_buffers to in PostgreSQL. Most of it is, at best, overthinking a fairly simple setting.

You can get 90% of the way there, on 95%+ of PostgreSQL systems, just by setting it to 25% of total system memory, to a maximum of 32GB. (It used to be 8GB, and then 16GB; as processors get faster, the number tends to go up.)

In particular, the difference between 24GB and 32GB, for example, will almost certainly be unmeasurable, on a system with 128GB or more memory.

It can be of benefit to lower shared_buffers on systems that have a high write rate, to reduce the total amount of data flushed on one checkpoint. I don’t have analytic values for that (shame on me!), but I’ve seen benefits by reducing it to as low as 2GB on a very high write-rate system.

But don’t agonize.

In a February blog post I talked about the mismatch between what people expect from Postgres in terms of hints, and what exists. In this blog post I would like to cover the more general case of when people should expect feature parity with their previous database, and when such expectations are unreasonable.

First, imagine if every database had the features of every other database — that would be great for compatibility but terrible for usability. You would have so many ways of doing something, with slightly different behaviors, that development and administration would be much more difficult. Of course, Perl has its "There's more than one way to do it," but at least that allows tradeoffs between clarity and conciseness, and allows different programming styles. Having all features in every database would have few of those benefits. Also consider that some features are mutually exclusive, so this would be impossible. Therefore, we have to make tradeoffs in the features each database system supports.

Let's think of Postgres compatibility at three levels: SQL, tuning, and monitoring. At the SQL level, you should expect parity between Postgres and your previous database. The syntax might be different, but all the capabilities should be similar, e.g. if your application used savepoints with the previous database, Postgres should support that, and in almost all cases, it does.

Continue Reading »

Posted by Vasilis Ventirozos in OmniTI on 2017-06-09 at 13:39
I've been in situations where I need a slave db to replay a lot of WAL files fast, and by a lot I mean tens of thousands. This could happen because of a reporting database refresh or simply because a slave was down for an extended period of time. It's known that lowering shared_buffers speeds up WAL replay for obvious reasons, but by how much?

I did a benchmark on an old server and the results are interesting :

With 32GB of shared buffers and with 6390Mb of wals (1840 wal files)
it took 1408 seconds to complete the replay.

With 64MB of shared buffers and with 6510Mb of wals (1920 wal files)
it took 1132 seconds to complete the replay.

My test was done by stopping the slave, inserting 50 million rows into a test table, waiting for the WAL transfer to complete, then stopping the master, starting the slave, and watching the OmniPITR logs.

The performance gain in WAL replay was about 20% on Postgres 10beta1, which doesn't sound bad, especially in times of need.


Thanks for reading
-- Vasilis
Posted by Shaun M. Thomas in 2ndQuadrant on 2017-06-09 at 12:30

With the christening of PG Phriday at its new location, it’s only fitting that we revisit a frequent boon and bane of organized storage: partitioning. When volume and scale reach a certain point that the very laws of physics become a barrier to efficiency, partitions are a crucial tool to escape the otherwise inevitable pile of unmaintainable bits. It’s not feasible to reindex a multi-terabyte table with tens or hundreds of billions of rows. It’s excruciating to excise portions for long-term archival, and all but impossible to perform anything but the most cursory maintenance in general. Huge tables are a partition advocate’s dream.

But until fairly recently, the implementation in Postgres has been… less than savory. While nothing is perfect, Postgres has never actually offered partitioning. That may sound like a bold statement, but consider how the functionality is cobbled together. Let’s build four partitions for 10-million rows worth of data using the “old” approach:

CREATE SCHEMA old_part;
SET search_path TO old_part;

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

CREATE TABLE sensor_log_part_2014 (
    LIKE sensor_log INCLUDING INDEXES,
    CHECK (reading_date >= '2014-01-01' AND
           reading_date < '2015-01-01')    
) INHERITS (sensor_log);

CREATE TABLE sensor_log_part_2015 (
    LIKE sensor_log INCLUDING INDEXES,
    CHECK (reading_date >= '2015-01-01' AND
           reading_date < '2016-01-01')    
) INHERITS (sensor_log);

CREATE TABLE sensor_log_part_2016 (
    LIKE sensor_log INCLUDING INDEXES,
    CHECK (reading_date >= '2016-01-01' AND
           reading_date < '2017-01-01')    
) INHERITS (sensor_log);

CREATE TABLE sensor_log_part_2017 (
    LIKE sensor_log INCLUDING INDEXES,
    CHECK (reading_date >= '2017-01-01' AND
           reading_date
[...]
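
For contrast with the "old" approach above, the declarative syntax arriving in Postgres 10 looks roughly like this (a sketch of mine, not the post's own code):

CREATE SCHEMA new_part;
SET search_path TO new_part;

CREATE TABLE sensor_log (
  sensor_log_id  BIGINT NOT NULL,  -- no PRIMARY KEY: v10 does not allow one on the partitioned parent
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
) PARTITION BY RANGE (reading_date);

CREATE TABLE sensor_log_part_2017 PARTITION OF sensor_log
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');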
Posted by Dimitri Fontaine on 2017-06-08 at 11:23

Kris Jenkins cooked up a very nice way to embed SQL in your code: YeSQL for Clojure. The main idea is that you should be writing your SQL queries in .sql files in your code repository and maintain them there.

The idea is very good and it is now possible to find alternative implementations of the Clojure yesql library in other languages. Today, we are going to have a look at one of them for the python programming language: anosql.