In this article, I want to introduce the ICU support in PostgreSQL, which I have worked on for PostgreSQL version 10, to appear later this year.

Sorting

Sorting is an important functionality of a database system. First, users generally want to see data sorted. Any query result that contains more than one row and is destined for end-user consumption will probably want to be sorted, just for a better user experience. Second, a lot of the internal functionality of a database system depends on sorting data or having sorted data available. B-tree indexes are an obvious example. BRIN indexes have knowledge of order. Range partitioning has to compare values. Merge joins depend on sorted input. The idea that is common to these different techniques is that, roughly speaking, if you have sorted data and you know what you are looking for, it makes it much faster to locate the place where it should be found.

There are two important aspects to sorting. One is the sorting algorithm. This is a standard topic in computer science, and a lot of work has gone into PostgreSQL over the years to refine the various sorting algorithms and methods, but that’s not what I will write about. The other is deciding in what order things should be, which is what we call collation. In many cases, that choice is obvious. 1 comes before 2. FALSE comes before TRUE … well, someone just arbitrarily decided that one. A usually comes before B. But when it comes to natural language text, things get interesting. There are many different ways to order text, and the actual methods to collate text strings are more complicated than might be apparent. Different languages prefer different sort orders, but even within a language, there can be variations for different applications. And there are details to worry about, such as what to do about whitespace, punctuation, case differences, diacritic marks, and so on. Look up the Unicode Collation Algorithm for more insight into this.
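To make this concrete, here is a small sketch of my own (not from the article) of how the collation choice changes ordering, assuming a PostgreSQL 10 build with ICU support and the ICU collations it creates:

-- Same strings, two orderings: a German ICU collation vs. plain byte order.
SELECT w
FROM (VALUES ('Bank'::text), ('bänk'), ('banc')) AS t(w)
ORDER BY w COLLATE "de-x-icu";

SELECT w
FROM (VALUES ('Bank'::text), ('bänk'), ('banc')) AS t(w)
ORDER BY w COLLATE "C";

The first query sorts the words the way a German reader would expect; the second falls back to raw byte comparison, which puts 'Bank' ahead of the lowercase words.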

Before the ICU feature was committed, all this functionality was facilitated by the

[...]

So PostgreSQL 10beta1 came out recently, as Holly mentioned. When the first mention of beta hits, things start getting serious for me. I have to make sure that PostGIS compiles against said distribution so that eager testers aren't held back.

As with other releases, PostGIS didn't compile against the new PostgreSQL version without some nurturing. We've still got one regression failure, but at least PostGIS 2.4 now compiles cleanly against PostgreSQL 10beta1. I'm hoping that we can release PostGIS 2.4.0 just in time for PostgreSQL 10's planned release in September, so I don't have to backport my PostgreSQL 10 patches to lower PostGIS versions.

For PostGIS 2.4 the main focus will be cleaning up the parallel work so that all aggregate functions can enjoy parallel optimization. This is even more important with PostgreSQL 10, now that more kinds of queries can benefit from parallelization. I'm also hoping to focus more energy on the raster side of things.


Continue reading "PostGIS 2.4.0, Code Sprints and other extensions to try with PostgreSQL 10 beta1"
Nikita Glukhov recently identified and fixed a bug in jsquery (https://github.com/postgrespro/jsquery). Indexing of scalars was broken, so please upgrade jsquery!

The bug can be demonstrated with this example (the query returns different results with a seqscan and when using the index):


create table test(js jsonb);
insert into test values ('1');
select count(*) from test where js @@ '$ = 1';
 count
-------
     1
(1 row)

create index test_path_value_idx on test using gin(js jsonb_path_value_ops);
CREATE INDEX
set enable_seqscan = off;
SET
select count(*) from test where js @@ '$ = 1';
 count
-------
     0
(1 row)

explain select count(*) from test where js @@ '$ = 1';
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=12.02..12.03 rows=1 width=8)
   ->  Bitmap Heap Scan on test  (cost=8.00..12.01 rows=1 width=0)
         Recheck Cond: (js @@ '$ = 1'::jsquery)
         ->  Bitmap Index Scan on test_path_value_idx  (cost=0.00..8.00 rows=1 width=0)
               Index Cond: (js @@ '$ = 1'::jsquery)
(5 rows)

This is the story of how I applied basic performance analysis techniques to find a small change that resulted in a 10x improvement in CPU use for our Postgres cluster and will save Heap millions of dollars over the next year. Indexing Data for Customer Analytics Heap is a customer analytics tool that automatically captures […]

The post How Basic Performance Analysis Saved Us Millions appeared first on Heap Blog.

Some (long) time ago, someone on IRC suggested that I add an option to keep track of optimizations of queries. Sorry, I forgot your name, and the mails disappeared in some crash. Anyway – right now, when you are on some plan page, you can press the “Add optimization” button, and you will be redirected to index […]
Posted by Bruce Momjian in EnterpriseDB on 2017-05-19 at 14:30

The effective_io_concurrency setting controls how many concurrent requests can be made by bitmap heap scans. The default is 1, which means no concurrency. For magnetic disks, this can be increased, perhaps to 8, but not much higher because of the physical limitations of magnetic disks. However, for SSDs, this can be increased dramatically, perhaps into the hundreds.

This is documented in Postgres 9.6, as well as the ability to set effective_io_concurrency at the tablespace level. I have updated my Database Hardware Selection Guidelines presentation to mention this too.
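For reference, a quick sketch of both ways to set it (the tablespace name here is made up):

-- Per session, e.g. for a magnetic-disk system:
SET effective_io_concurrency = 8;

-- Per tablespace (possible since 9.6); "ssd_space" is a hypothetical tablespace:
ALTER TABLESPACE ssd_space SET (effective_io_concurrency = 200);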

Posted by Holly Orr in OpenSCG on 2017-05-19 at 13:52

Hey, have you heard?

PostgreSQL 10 Beta is road ready!

Kick the tires and take the cool new features and improvements out for a spin. Highlights include:

  • Declarative Partitioning
  • Logical Replication
  • SCRAM Authentication

See the full list of features here.

Take the easy route

Never used the BigSQL PostgreSQL Distributions before? Follow these 5 easy steps. Already have BigSQL pgc installed? Jump to step #3.

  1. Read this short introduction to BigSQL’s package and update manager (pgc).

  2. Install pgc via command line to a sandbox:

    MAC / Linux:

    python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)" 
    
    

    Windows:

    @powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"
    
    
  3. Get into the product home directory by navigating via command line to the bigsql directory:

    cd bigsql
    
    
  4. Using the pgc command line tool, run the update command to get the latest distribution (Windows users, don’t prefix the pgc command with ./ as shown in the examples below):

    ./pgc update
    
    
  5. Then run the install, init, and start commands on pg10:

    ./pgc install pg10
    ./pgc init pg10
    ./pgc start pg10
    
    

That’s it! You’re ready to go. Got questions or suggestions? Get in touch with us at BigSQL Feedback.

Today's blog post is old news, but it's not very well known, and it has to do with how to re-add an old master after a slave has been promoted. Fujii Masao explained the situation in his patch back in the day.

So in today's post I will demonstrate a combination of replication slots for retaining the writes on a promoted slave, and how to re-introduce an old master to the replication.

Say that we have a master-slave setup; stop the master gracefully, then create and activate a replication slot on the slave:

monkey=# SELECT * FROM pg_create_physical_replication_slot('this_is_a_replica_slot');
       slot_name        | xlog_position
------------------------+---------------
 this_is_a_replica_slot |
(1 row)

postgres@bf9823730feb:~$ pg_receivexlog -D . -S this_is_a_replica_slot -v -h 10.0.0.3 -U repuser
pg_receivexlog: starting log streaming at 0/4000000 (timeline 1) ^C
pg_receivexlog: received interrupt signal, exiting
pg_receivexlog: not renaming "000000010000000000000004.partial", segment is not complete
pg_receivexlog: received interrupt signal, exiting


This way all the changes after the slave promotion will be retained. Now, let's promote the slave and check the replication slot status.

postgres@bf9823730feb:~$ pg_ctl promote
server promoting

postgres@bf9823730feb:~$ psql -c "SELECT slot_name, database,active,  pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots;" postgres
       slot_name        | database | active | retained_bytes
------------------------+----------+--------+----------------
 this_is_a_replica_slot |          | f      |           9056

As you can see, the new master is now retaining all writes. Now, on to the old master: let's make a recovery.conf that looks like this:

standby_mode = 'on'
primary_conninfo = 'user=repuser host=10.0.0.3 port=5432 application_name=a_slave'
trigger_file = '/home/postgres/pgdata/finish.recovery'
primary_slot_name = 'this_is_a_replica_slot'
recovery_target_timeline = 'latest'

where host is obviously the new master. 
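Once the old master is started with this recovery.conf in place, it should begin streaming from the slot. A quick sanity check, run on the new master (a sketch, not from the original post):

-- The old master should show up under the application_name
-- set in its primary_conninfo ('a_slave' above).
SELECT application_name, state, sync_state
FROM pg_stat_replication;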
I missed it completely, but on 24th of March 2017, Alvaro Herrera committed patch: Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations than individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are […]
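For a quick taste of the feature, here is a minimal sketch of my own (using the final PostgreSQL 10 syntax; the table and columns are made up):

-- Two correlated columns: b is fully determined by a.
CREATE TABLE t AS
SELECT i AS a, i / 100 AS b FROM generate_series(1, 10000) s(i);

-- Declare a multivariate n-distinct statistics object and collect it.
CREATE STATISTICS s1 (ndistinct) ON a, b FROM t;
ANALYZE t;

-- The planner can now estimate the number of distinct (a, b) groups
-- far better than by multiplying the per-column estimates.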
Posted by Holly Orr in OpenSCG on 2017-05-17 at 19:23

Install in 6 easy steps

Never used the BigSQL PostgreSQL Distributions before? Follow these 6 easy steps. Already have BigSQL pgc installed? Jump to step #3.

  1. Read this short introduction to BigSQL’s package and update manager (pgc).

  2. Install pgc via command line to a sandbox:

    MAC / Linux:

    python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)" 
    
    

    Windows:

    @powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"
    
    
  3. Get into the product home directory by navigating via command line to the bigsql directory (Windows users, don’t prefix the pgc command with ./ as shown in the examples below):

    cd bigsql
    
    
  4. Using the pgc command line tool, run the update command to get the latest distribution:

    ./pgc update
    
    
  5. Then run the install, init, and start commands on pgdevops:

    ./pgc install pgdevops
    ./pgc init pgdevops
    ./pgc start pgdevops
    
    
  6. Open pgDevOps in browser:

    localhost:8051
    
    

What you get:

Easily build PostgreSQL development environments:

Eliminate wait times by giving developers the power to stand up precompiled PostgreSQL versions on Windows, Linux, and Mac.

One-click PostgreSQL component installs:

In the package manager dashboard choose from an extensive list of community projects to extend your PostgreSQL capabilities.

Monitor your PostgreSQL databases:

Intuitive graphs, metrics, and one-click access to logfiles make it easy to monitor the implications of changes to code, queries, and infrastructure.

The plProfiler Console creates performance profiles of PL/pgSQL functions and stored procedures, helping you increase the efficiency of your code.

The pgBadger Console gives you advanced log analysis features.

Want to learn more about the DevOps philosophy behind BigSQL’s pgDevOps development? Read DevOps: Can’t we just all get along?

Got questions or suggestions? Get in touch with us at BigSQL Feedback.

Posted by Bruce Momjian in EnterpriseDB on 2017-05-17 at 18:00

I have reorganized my twenty active presentations to be easier to find. I was getting confused, so I assume others were as well. The new categories seem clearer and more closely match the categories I use for Postgres blog entries. I have reordered items within categories. I have also indicated more presentations that can be given in pairs.

Also, this web page now defaults to using https/TLS.

PGConf US and Austin PUG organized a PGConf Mini last night. It was three presentations over the evening. We had good attendance and excellent community participation. The highlight of the evening was PGConf US Co-Chair Jim Mlodgenski's presentation, "Top 10 Mistakes When Migrating from Oracle to PostgreSQL".

We tried something new this time around. We utilized Twitter's Periscope capability to live stream his presentation. It wasn't perfect but it was our first time and we are looking forward to utilizing the platform more in the future.

Click here to watch "Top 10 Mistakes When Migrating From Oracle to PostgreSQL", and then submit a presentation to one of our upcoming conferences:


PGConf US 2017 - 2018 
  • Diamond Sponsor: Amazon Web Services
  • Platinum Sponsor: OpenSCG

Having spent the last week with Oracle DBAs eagerly peeking towards PostgreSQL I saw that besides the usual beginner questions one of the Postgres concepts that sounded kind of familiar for Oracle folks (but still brought some confusion, as it’s not quite the same thing) are tablespaces. From higher (user) level everything is pretty similar […]

The post PostgreSQL & tablespaces – it’s not so scary … appeared first on Cybertec - The PostgreSQL Database Company.

Posted by Joshua Drake on 2017-05-16 at 17:00

PGConf US in partnership with SeaPug is pleased to announce that the call for papers for PostgreSQL Conference US Local: Seattle is now open. 


PostgreSQL Conference US Local: Seattle is taking place August 11th and 12th 2017! Call for Papers is now open and presentations can be submitted here.



National Diamond Sponsor: Amazon Web Services


The call for papers will be open from May 16, 2017 until July 2, 2017. Speakers will be notified of acceptance/decline no later than July 8.

The two-track, two-day conference is a perfect opportunity for the Vancouver, B.C., west coast, and Rocky Mountain regions to join the PostgreSQL community and increase their knowledge.
National Platinum Sponsor: OpenSCG

Breakdown of the Conference Layout: 

  • August 11th: Training
  • August 12th: Breakout Sessions 

All selected presenters will receive free entry to the breakout sessions (trainings are extra). There are no grants for travel and accommodations. We encourage everyone to submit a talk or training to one of our very first PostgreSQL Conference US Local events and be a part of growing the PostgreSQL community.

Postgres 10 will be released in a couple of months, with its beta 1 to be out very soon, and for this release many small, incompatible changes have been introduced during its development to help with long-term maintenance. The full list of items is present in the release notes, and here are the changes that can have an impact on any applications relying on what PostgreSQL has provided up to now.

Here are the changes related to the on-disk format of the data folders.

  • pg_xlog has been renamed to pg_wal. This primarily has consequences for any application handling backup and restore, though the update should be straightforward.
  • pg_clog has been renamed to pg_xact.
  • Default value for log_directory has been changed from “pg_log” to “log”. Most applications likely set this parameter to an absolute path anyway, hopefully pointing to a different partition. Or not.

The idea behind the first two items is to protect users who are tempted to run “rm -rf *log” in a data folder to free up space when the partition holding the data gets full. This would result in a corrupted cluster.

Some commands have changes in option names, as well as default behavior changes. Some commands are removed.

  • pg_basebackup has changed the default value of --xlog-method to “stream”, and “none” can be used to get the past behavior. It has been proven that users like being able to rely on self-contained, consistent backups. The option -x has been removed as well, being replaced by “-X fetch”.
  • pg_ctl now waits for its subcommands to complete by default. Note that if your application relied on “start” returning immediately, pg_ctl will now, when starting an instance that begins in recovery, also wait until the server has reached a consistent state. Better to be careful about that. “stop” has always used the wait mode.
  • createlang and droplang are no more. RIP.

On the system side of things, many changes are piling up:

  • Cleartext password support has been removed, trying for example to create a
[...]
Posted by Bruce Momjian in EnterpriseDB on 2017-05-15 at 15:15

Disk array controllers gained popularity in a world of magnetic disks by providing:

  • Hardware RAID support
  • Large read/write cache
  • Write reordering
  • Instant durable storage of writes using a BBU

With the popularity of SSDs, particularly those with BBUs (which are the only kind that should be used with Postgres), the usefulness of RAID controllers is now unclear:

  • Modern CPUs are very fast and can easily do RAID in software
  • Modern servers have tons of RAM
  • Write reordering is unnecessary for SSDs, which have good random write performance
  • BBUs on SSDs make RAID BBUs redundant

And RAID controllers introduce problems of their own:

Continue Reading »

Posted by Holly Orr in OpenSCG on 2017-05-15 at 13:32


pgDevOps will help your developers and operations teams work collaboratively, efficiently, and openly.

To understand the value added when you include pgDevOps in your PostgreSQL tool-chain, keep reading.

tldr;

Go straight to the tutorial to start using pgDevOps today!

Why DevOps: A brief explainer

First a little background to better understand why BigSQL built pgDevops.


“It’s not my machines, it’s your code!”
“It’s not my code, it’s your machines!”
– from 10 Deploys a Day: Dev and Ops Cooperation at Flickr
(and said by everyone)

Sound familiar? This is a common interaction (distraction) between development and operations teams in IT Departments everywhere and everyday. This is because the two roles’ directives are often diametrically opposed. Devs need to respond quickly to changes. Ops must maintain stability.

DevOps: To the rescue

Insanity: doing the same thing over and over again and expecting different results.
– Albert Einstein

In 2008 Andrew Clay Shafer and Patrick Debois first discussed “Agile Infrastructure” at the Agile conference. Soon after, the term “DevOps” was coined as a moniker for the practice of building infrastructure, systems, and a culture to enable cooperation between development and operations teams.

At the Velocity Conference in 2009, John Allspaw and Paul Hammond gave their seminal presentation 10 Deploys a Day: Dev and Ops Cooperation at Flickr where they outlined the 10 key ingredients that would become central to the DevOps philosophy:

Tools

  1. Automated infrastructure
  2. Shared version control
  3. One step build and deploy
  4. Feature flags
  5. Shared metrics
  6. IRC and IM robots

Culture

  1. Respect
  2. Trust
  3. Healthy attitude about failure
  4. Avoiding blame

Since 2009, DevOps practices have become increasingly popular as businesses require faster time to market to remain competitive.

In fact, DevOps teams increased from 16% in 2014 to 19% in 2015 to 22% in 2016.
– 2016 State of DevOps Report

DevOps ROI

At BigSQL we have seen the benefits of adopting Agile / Lean / DevOps practices. But don’t take our wo

[...]
Posted by Federico Campoli on 2017-05-13 at 14:36
Last week I announced the stable release of pg_chameleon as I felt quite confident the tool worked properly.

However, Murphy's law is always ready to teach us that we live in an interesting universe. A couple of days after the release I noticed a strange replica issue on a server whose data was seldom modified. For some reason, at specific moments of the day, the inserts replayed on the postgres tables failed with a primary key violation. It took me a while to reproduce the bug, as this is an interesting corner case. This sequence of statements on MySQL causes the replica to break down in version 1.0.

DROP TABLE IF EXISTS test;
CREATE TABLE test(
id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
value1 VARCHAR(45) NOT NULL,
PRIMARY KEY (id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TEMPORARY TABLE tmp_test(
id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
value1 VARCHAR(45) NOT NULL,
PRIMARY KEY (id)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into tmp_test (value1) values('blah'),('blah');
insert into test (value1) values('blah');
DROP TEMPORARY TABLE if exists tmp_test ;
This is caused by the batch not being marked as closed when the query is received. The batch was closed only when the query was parsed. The statements CREATE TEMPORARY TABLE are not parsed (by design) and therefore the batch was not closed correctly. The subsequent row images generated by the inserts were read at each loop, causing a replay issue because of the primary key violation.

I also took the occasion to complete the docstrings for the last library, sql_util.py. Now the classes are fully documented.

The release 1.1 is already available on pypi.

I've also created a google group for help and general discussions.

I'd really love to hear users' stories.
PostgreSQL JSQuery extension Windows binaries

The JSQuery extension is a PostgreSQL extension developed by Postgres Professional. You can get the source code and instructions for use at https://github.com/postgrespro/jsquery. JSQuery is a fairly easy compile and install if you are on a *Nix system. It provides more query functionality for JSONB, and additional index operator classes to support it, than you get built in to PostgreSQL. It is supported for PostgreSQL 9.4 and above.

We've built Windows binaries for PostgreSQL 64-bit 9.4, 9.5 and 9.6. The 9.4 64-bit build will only install on the EDB PostgreSQL 9.4 64-bit distribution. The 9.5 and 9.6 builds are compatible with both the PostgreSQL EDB and BigSQL distributions. We should have 32-bit versions later and will link to those here.


Continue reading "PostgreSQL JSQuery extension Windows binaries"
Posted by Bruce Momjian in EnterpriseDB on 2017-05-12 at 16:00

Having given the first presentation of my new window function talk, Postgres Window Magic, I have now put the slides online.

Posted by Tomas Vondra in 2ndQuadrant on 2017-05-12 at 13:00

For the last few months, we at 2ndQuadrant have been working on merging PostgreSQL 9.6 into Postgres-XL, which turned out to be quite challenging for various reasons and took more time than initially planned due to several invasive upstream changes. If you’re interested, check out the official repository here (look at the “master” branch for now).

There’s still quite a bit of work to be done – merging a few remaining bits from upstream, fixing known bugs and regression failures, testing, etc. If you’re considering contributing to Postgres-XL, this is an ideal opportunity (send me an e-mail and I’ll help you with the first steps).

But overall, Postgres-XL 9.6 is clearly a major step forward in a number of important areas.

New features in Postgres-XL 9.6

So, what new features does Postgres-XL gain from the PostgreSQL 9.6 merge? I could simply point you to the upstream release notes – most of the improvements directly apply to XL 9.6, with the exception of those related to features unsupported on XL.

The main user-visible improvement in PostgreSQL 9.6 was clearly parallel query, and that also applies to Postgres-XL 9.6.

Intra-node parallelism

Before PostgreSQL 9.6, Postgres-XL was one of the ways to get parallel queries (by placing multiple Postgres-XL nodes on the same machine). Since PostgreSQL 9.6 that’s no longer necessary, but it also means Postgres-XL gains intra-node parallelism capability.

For comparison, this is what Postgres-XL 9.5 allowed you to do – distributing a query to multiple data nodes, but each data node was still subject to the “one backend per query” limit, just like plain PostgreSQL.

Thanks to the PostgreSQL 9.6 parallel query feature, Postgres-XL 9.6 can now do this:

That is, each data node can now run its part of the query in parallel, using the upstream parallel query infrastructure. That’s great and makes Postgres-XL much more powerful when it comes to analytical workloads.
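A minimal sketch of what that looks like in practice, using the parallel-query GUC inherited from upstream 9.6 ("big_table" is a hypothetical distributed table):

SET max_parallel_workers_per_gather = 4;

-- The plan on each data node can now contain a Gather node
-- with several parallel workers underneath it.
EXPLAIN SELECT count(*) FROM big_table;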

Maintaining a fork

I mentioned this merge turned out to be more challenging than we initially expected, for a n

[...]
Posted by Joshua Drake on 2017-05-11 at 18:12
Tuesday, May 16, 2017 6:00 PM to 9:00 PM @ fibercove
1700 S Lamar Blvd, Suite 338, Austin, TX

Join us for a special presentation with PostgreSQL experts who will be in town for DataLayer 2017.

Food and refreshments will be provided, so please be sure to RSVP.

Thank you to OpenSCG for sponsoring our food and soft drinks, and fibercove for hosting us! Beer sponsor TBA.
  • The event starts at: 6 pm 
  • Networking and food/drink : 6:15 pm 
  • Announcements and updates: 6:30 pm   
Presentations:
• "Softlayer vs RDS/AWS: A price to performance perspective" - Joshua D. Drake, Command Prompt, Inc.

• "PostgreSQL on Debian and apt.postgresql.org" - Dr. Michael Meskes, CEO of credativ (http://www.credativ.com/)

• "Top 10 Mistakes When Migrating From Oracle to PostgreSQL" -- Jim Mlodgenski, CTO of OpenSCG
 
About our Speakers: 

Joshua D. Drake of Command Prompt, Inc. (https://www.commandprompt.com/), is a PostgreSQL Consultant who has consulted on PostgreSQL since Postgres95. Throughout his PostgreSQL career he has performed a variety of functions within the community, including releasing an O'Reilly book on PostgreSQL, as well as being part of the sysadmins and advocacy teams. He ran a biannual conference series dedicated to PostgreSQL from 2007 - 2011. He is a former Director of SPI (the non-profit for PostgreSQL.org). Drake is also a Founder of United States PostgreSQL, as well as a co-organizer and Chair of PGConf US.

Dr. Michael Meskes is President and CEO of the credativ Group, an industry leader in free software services with offices in five countries. Its Open Source Support Centers employ leading members of a number of Open-Source projects. He has been an Open-Source developer for twenty years, working on different Open-Source projects, among which Debian and PostgreSQL are the most widely known. He has also given many Open-Source related presentations at all sorts of events, doing a great deal of Open-Source evangelism.

Jim Mlodgenski is CTO of OpenSCG, a leading enterprise open source services company, with a central focus on P[...]
Posted by Markus Winand on 2017-05-11 at 00:00

Here is the most noteworthy news since my last “Big News In Databases” post from November 2016.

A new SQL standard was released (SQL:2016)

In December 2016, ISO released a new version of the SQL standard. It introduces new features such as row pattern matching, listagg, date and time formatting, and JSON support. I've already written about listagg and made a presentation about row pattern matching.

Subscribe to the modern-sql.com blog for further updates on the new standard.

Oracle Database 12.2 on premises, cloud prices double for AWS and Azure

Oracle’s cloud strategy was already a topic in my previous post: although the latest database release, 12.2, came out in November 2016, it was not available for download at that time—it was only available as a service in the Oracle cloud.

In January 2017, Oracle effectively doubled the license cost for the Oracle databases in Amazon’s and Microsoft’s cloud environments. And how does Microsoft react? On May 10, Microsoft announced the availability of MySQL and PostgreSQL as managed services in the Azure cloud.

Begun the cloud war has. One party opens up, another isolates itself. The market leader watches.

In light of these developments, the release of the Oracle database 12.2 download in March is just a side note.

Database of the year

In January, DB-Engines.com (“TIOBE of databases”) published their database of the year 2016. According to their ranking method, SQL Server had the greatest gains in 2016 and is thus the database of the year 2016. Second and third place went to MySQL and PostgreSQL.

An interesting side note from their announcement: “It is the first time we see three RDBMS winning that competition, in the past we had at least two NoSQL systems among the winners.”

ACIDRain: Study highlights security risk of wrong transaction use

One of the consequences of database normalization is that transactions are needed to work safely. SQL has had transactions for a long time. The latest change in this area was the introduction of transaction isolation levels—25 years ago.

A new s

[...]
Posted by Bruce Momjian in EnterpriseDB on 2017-05-10 at 16:00

I did my first presentation covering Postgres 10 features, so my slides are now online.

Posted by Craig Kerstiens in CitusData on 2017-05-10 at 14:45

There are a number of applications out there that have a high number of connections to Postgres. What’s high? That all depends on your application, but generally when you get to the few hundred connection area in Postgres you’re in the higher end. Anything in the thousands is definitely in the high territory, and even several hundred can put strain on your application. Generally a safe level for connections should be somewhere around 300-500 connections. This may seem low if you’re already running with thousands of connections, but it’s likely perfectly fine with pgBouncer taking care of the heavy lifting for you. Let’s drill into why a bit further.

Connection basics in Postgres

Each new connection to Postgres is a forked process. This process comes with its own memory allocation of roughly 10 MB, plus at least some more under query load. In this sense every new connection is already consuming some of those precious resources you want processing queries. For 300 database connections this is 3 GB of memory going just to managing those connections—memory which could be better used for caching your data.
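To see where your own instance stands, a simple breakdown by connection state (my sketch, not from the post):

SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;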

When a connection isn’t just a connection

Many application frameworks such as Rails like to grab a pool of connections when they start up. This reduces the time needed to get a connection when they run a query; instead they pull one from the pool. This is good for your app’s performance, as the time to get a connection isn’t always trivial. However, this does result in a bunch of connections to your database sitting idle. Recently I saw an application with 300 open connections but only 17 active queries. How do you check active queries? You can run this to get some insight into currently active queries:

SELECT COUNT(*)
FROM pg_stat_activity
WHERE state <> 'idle';

The issue here is that by default every new request is going to grab a new connection, so if you have 300 requests happening at a time you have 300 connections to your database. For something like sidekiq, which processes background jobs for Ruby, you can

[...]
Posted by Pavel Golub in MicroOLAP on 2017-05-10 at 09:04

Stack Overflow provided a great tool for creating charts. This is mine for “PostgreSQL vs MySQL vs Oracle vs MSSQL”.


PostgreSQL vs MySQL vs Oracle vs MSSQL on Stack Overflow trends

P.S. Oracle and MSSQL have some additional tags depending on version. However, I'm sure we can leave them out of the statistics, because the main tag is always used IMO.


Posted by Vasilis Ventirozos in OmniTI on 2017-05-09 at 15:08
Postgres provides a lot of information when it comes to statistics. The only problem is that all the statistics are frozen in time the moment you query the system catalog views. Sometimes you need to see what's happening over time or in the past; this is easily solvable with a script and a crontab entry. At OmniTI we use a project called system_monitoring a lot. This is basically a perl scheduler that runs as a daemon, exporting useful information from postgres (and more) to tab-delimited text files for later investigation. If you have decent one-liner skills this is very helpful: you can answer questions about what happened yesterday night, what was running, who locked whom, etc. It's very good, but data analysis can be a pain in the ass sometimes, especially if you need to reimport to a db and analyze with SQL. So I decided to approach this in a different way, by storing the statistics in a schema. That's why I wrote statz.
Statz is a very simple python script that will gather all postgres statistics over a period of time at a given interval, aggregate them, and keep both raw and aggregated data in its own schema. The raw data it keeps is just a select now(),* from the most important statistics tables (there's a sketch of this right after the list):

  • pg_stat_user_indexes
  • pg_stat_activity
  • pg_locks
  • pg_stat_user_tables
  • pg_stat_database
  • pg_stat_bgwriter
But it's relatively easy to add more, even application-specific data.
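For flavor, the per-interval raw capture is essentially just this (a sketch; the actual column layout statz uses may differ):

-- First snapshot creates the raw-capture table:
CREATE SCHEMA IF NOT EXISTS statz;
CREATE TABLE statz.backend_activity AS
SELECT now() AS snap_time, * FROM pg_stat_activity;

-- Each subsequent interval appends another snapshot:
INSERT INTO statz.backend_activity
SELECT now(), * FROM pg_stat_activity;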

How to use :

statz needs a database to connect to (for now it's the same as the database being monitored), an interval, and a total duration (interval and total duration should be in seconds).
So, something like:

./statz.py -c dbname=monkey -i 5 -t 120 

would initialize a schema called statz (this is hardcoded for now); if the schema exists, it will drop it (cascade) and recreate it.
In a loop that lasts for <total time>, sleeping for <interval>, it keeps the output of select now(),* from the tables listed above, populating the following tables:

statz.backend_activity
statz.bgwriter_activity
statz.database_activity
statz.index_activity
statz.lock_activity
statz
[...]

Transactions are an important feature in any database and PostgreSQL has them. We want to be “ACID Compliant” and ensure that all data is read and written in a consistent state. All changes in the transaction either commit or rollback together. You never change just one table, without also changing all the other related tables together in an atomic block. Transactions are great!

Yeah. But … what about when you actually want to change just one table and roll back all the rest? Or you have to ensure that one change is made whether or not the rest of the transaction is committed? Say you need to write an audit record of an attempted change or data access by a user. Or say you are trying to update a batch job control table? What do you do then?

As a database developer, you want these sub-transactions or autonomous transactions to be processed and committed regardless of what happens to the larger surrounding transaction block. The application could open 2 connections or make 2 transactions in a row, but if the 2nd transaction or connection failed you might not get your audit record written, and you would be paying a performance penalty.

If you’re coming from an Oracle background, you may be familiar with PRAGMA AUTONOMOUS_TRANSACTION – which makes a procedure execute in an autonomous transaction, so any changes in the procedure are committed separately from the calling PL/SQL code. PostgreSQL does not directly support autonomous transactions, but there are multiple ways to accomplish the same goal.
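One of those ways, before reaching for an extension, is the classic dblink trick: run the statement over a second connection, which commits independently of the caller. A minimal sketch (the audit_log table is made up):

CREATE EXTENSION IF NOT EXISTS dblink;

BEGIN;
-- ... work that may later be rolled back ...
SELECT dblink_exec('dbname=' || current_database(),
                   $$INSERT INTO audit_log(msg) VALUES ('attempted change')$$);
ROLLBACK;
-- The audit_log row survives the rollback: it was committed
-- by the second connection.

The post itself focuses on a neater option, pg_background.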

Installing pg_background

The pg_background Postgres extension, originally written by Robert Haas, starts a separate worker to process a SQL statement, which is a nice way to achieve an autonomous transaction. pg_background is available through the Postgres by BigSQL distribution, so you can install it using the PGC command line tool:

#always make sure you have the latest list of extensions
./pgc UPDATE
./pgc install background1_pg96

Then create the extension in your database as normal:

-- create the extension needed for
[...]

One of the changes coming in PostgreSQL 10 is the ability for the CASE .. WHEN statement to return multiple rows if the expression contains a set returning function. To demonstrate the feature, we'll use the classic generate_series function:


Continue reading "CASE with set returning functions in PostgreSQL 10"


Join the PostgreSQL community in Philadelphia on July 13th and 14th 2017 for two days of fantastic PostgreSQL content. July 13th will contain trainings and July 14th will be breakout sessions. The call for papers and sponsors is also open! When submitting papers to PGConf US events, please review the presentation guidelines.



PGConf US Local events are designed to bring comprehensive educational content and networking opportunities to the "local" PostgreSQL community where the event is being held. They are perfect opportunities to show support for PostgreSQL, find leads, and build relationships with other professionals and companies using and supporting PostgreSQL.

Hosted at Huntsman Hall at The Wharton School.