In our previous article we described what retention policies are and how they can be enforced on your PostgreSQL server backups with Barman 1.2. In this post, we will go through the configuration aspects.
For the sake of simplicity, we assume a typical scenario which involves taking full backups once a week through the “barman backup” command. Suppose you want to keep the latest 4 backups and let Barman automatically delete the obsolete ones.
The main configuration option for retention policies in Barman is “retention_policy”, which can be defined at global or server level. If you want all your servers to keep the last 4 periodical backups by default, add the following line to the general section of Barman’s configuration file:
[barman]
... // General settings
retention_policy: REDUNDANCY 4
When the next “barman cron” command is executed (every minute if you installed Barman using RPMs or Debian/Ubuntu packages), Barman checks the number of available full periodical backups for every server, orders them in descending chronological order (from the most recent to the oldest) and deletes backups from the 5th position onwards.
In case you have several servers backed up on the same Barman host and you want to differentiate the retention policy for a specific server, you can simply edit that server’s configuration section (or file, see “Managing the backup of several PostgreSQL servers with Barman”) and define a different setting:
[malcolm]
description = Malcolm Rocks
ssh_command = ssh malcolm
conninfo = host=malcolm port=5432 user=postgres dbname=postgres
retention_policy: REDUNDANCY 8
However, Barman also allows system administrators to manage retention policies based on time, in terms of recovery window and point of recoverability. For example, you can set another server to allow recovery at any point in time within the last 3 months:
[angus]
description = Angus Rocks
ssh_command = ssh angus
conninfo = host=angus port=5432 user=postgres dbname=postgres
retention_policy: RECOVERY WINDOW OF 3 MONTHS
So, here’s what I want to say today:
alembic revision -m "bug XXXXXX Add a new table" --autogenerate
The most difficult thing to deal with so far is the many User Defined Functions that we use in Socorro. This isn’t something that any of the migration tools I tested handle well.
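For context, the kind of object that trips up autogenerate looks something like this (a generic sketch, not Socorro's actual code; the function name is made up):

CREATE OR REPLACE FUNCTION week_begins_utc(ts timestamptz)
RETURNS date
LANGUAGE sql IMMUTABLE AS $$
  -- truncate a timestamp to the start of its week, in UTC
  SELECT date_trunc('week', $1 AT TIME ZONE 'UTC')::date;
$$;

Alembic's autogenerate only diffs table-level metadata, so changes to a function body like this go undetected and have to be written into migrations by hand (for example via op.execute()).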
Happy to answer questions! And I’ll see about making a longer talk about this transition soon.
Do you use PostgreSQL and truly believe it’s “the world’s most advanced open source database” and that its upcoming 9.3 release will make it even more awesome?
Do you also use Python and believe it’s “an easy to learn, powerful programming language” with “elegant syntax” that makes it an ideal language for developing applications and tools around PostgreSQL, such as Pyrseas?
We have also been requested to add the capability to load and maintain “static data” (relatively small, unchanging tables) as part of yamltodb, so that it can be integrated more easily into database version control workflows.
And for the next release, Pyrseas 0.7, we’d like to include the first version of the database augmentation tool, which will support declarative implementation of business logic in the database, starting off with audit trail columns. Some work has been done on this already, but it needs integration with the current code and tests.
Or perhaps coding is not your forte, but you’re really good at explaining and documenting technical “stuff”. Then you could give us a hand with revamping the docs, maybe writing a tutorial so that users have a smooth ride using our tools.
Or maybe you have your own ideas as to how to improve the PostgreSQL version control experience. We’d love to hear those too.
initdb [OPTION]... [DATADIR]
  -k, --data-checksums      use data page checksums

initdb -D data_directory -k

Now, any data corruption found will be reported as below:

postgres=# select * from corruption_test;
WARNING: page verification failed, calculated checksum 63023 but expected 48009
ERROR: invalid page in block 0 of relation base/12896/16409

In earlier versions, there was just an error message:

postgres=# select * from corruption_test where id=1;
ERROR: invalid page header in block 0 of relation base/12870/18192

That's cool, right?

Some points on Disk page checksums:

$ export PGDATA=/usr/local/pg93beta/data
Data page checksum version: 1
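If a checksum failure does turn up and you still need to read past the damaged page, 9.3 also adds a developer-only setting that downgrades the error to a warning so the surviving rows can be salvaged. A minimal sketch (superuser session, strictly for data recovery on a throwaway copy of the cluster; corruption_test is the demo table from above):

postgres=# SET ignore_checksum_failure = on;   -- developer option added alongside checksums
postgres=# SELECT * FROM corruption_test;      -- bad pages now raise WARNINGs instead of ERRORs
postgres=# SET ignore_checksum_failure = off;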
The Call for Papers for DjangoCon US 2013 is now open.
This option would be incompatible with the tablekeyset and tablekeyprefix options. If given, the key won't be looked up at all. We would simply use the given key and return the corresponding list of values. That would make selecting from such a table faster - possibly lots faster. For scalars, sets and lists, the table would have one column. In the case of a scalar there would only be one row. For zsets, it would have two columns - one for the value and one for the score. Finally, for hashes it would have two, one for the property and one for the value.

CREATE FOREIGN TABLE hugeset (value text)
OPTIONS (tabletype 'list', singleton_key 'myhugelist');
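To make the proposed column layouts concrete, here is how a zset stored under a single key might be declared and queried. This is purely a sketch of the suggested interface: the server name redis_server and the key 'scores' are made up, and the options follow the proposal rather than any released redis_fdw:

CREATE FOREIGN TABLE scoreboard (value text, score numeric)
SERVER redis_server
OPTIONS (tabletype 'zset', singleton_key 'scores');

-- the key itself is never looked up; we just fetch its members and their scores
SELECT value, score FROM scoreboard ORDER BY score DESC LIMIT 10;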
"I love Mongo's HA story. Out of the box I can build a 3-node Mongo cluster with a full replica set. I can add nodes, I can fail over, without losing data."Wouldn't it be nice if we could say the same thing about Postgres? But we can't.
Sample data (state_data.txt):

9,Karnataka,कर्नाटक

Table to import the data into:

create table states(state_code int, state_name char(30), state_in_hindi text);

Error:

postgres=# copy test from 'c:/Pgfile/state_data.txt' with delimiter ',' CSV;
ERROR: invalid input syntax for integer: "ï»¿9"
CONTEXT: COPY test, line 1, column state_code: "ï»¿9"

To fix it, I used a tool, "bomremover.exe", to remove the leading characters from the file, since this is on Windows; on Linux there are many tips & tricks available on the net to wipe the BOM from a UTF-8 format file. E.g.:
C:\Pgfile>bomremover.exe . *
Added '.\state_data.txt' to processing list.
Press enter to process all files in the list. (1 files in total)
Processing file '.\state_data.txt'...
Finished. Press Enter to Exit
After running bomremover.exe on the file, re-running the COPY command succeeds and the data is imported:

 state_code | state_name | state_in_hindi
------------+------------+----------------
          9 | Karnataka  | कर्नाटक
         10 | Kerala     | केरल
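If installing an external tool isn't an option, a pure-SQL workaround should also work: load the first column as text into a staging table and strip the BOM before casting. This is my own sketch (the staging table name is made up, and it assumes a UTF-8 database):

create table states_staging(state_code text, state_name char(30), state_in_hindi text);

copy states_staging from 'c:/Pgfile/state_data.txt' with delimiter ',' CSV;

-- U+FEFF is the byte order mark; strip it from the first field, then cast
insert into states
select regexp_replace(state_code, '^\ufeff', '')::int, state_name, state_in_hindi
from states_staging;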
A week after the release of PgLife, the site is averaging thirty active users. (I define an active user as an IP address that has viewed the site for at least five minutes during the past hour.) I consider that a success. Since the release of PgLife, I have increased the content update interval and added an About page explaining the site's purpose, which also includes the active user count.
The site uses AJAX, Perl, Procmail rules, and Apache to collect and deliver dynamic content. Recent improvements in the Postgres mailing list archive feature set have made linking to emails much simpler.
PgLife was my first attempt at a dynamic website, and I learned a few things. First, I learned the value of having an alert file that can force a browser reload to push fixes and improvements to the browser. Second, I used the same file to allow pushing of news alerts to users, e.g. 9.3 Beta1. Third, I learned the importance of controlling browser and server caching and revalidation when using dynamic content.
Now that PostgreSQL 9.3 beta1 has been released we've started to jump start our experimentation by compiling our favorite extensions. First on the list is PL/V8 js.
This was compiled against 9.3beta1 for 64-bit and 32-bit, with PL/V8 version 1.4.0. We briefly tried it with the EDB Windows builds, which we downloaded from http://www.enterprisedb.com/products-services-training/pgbindownload, and it seems to work fine.
We hope Windows users find these useful.
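A quick way to smoke-test the build once it's installed (a minimal sketch; the function is just a throwaway example):

CREATE EXTENSION plv8;

CREATE FUNCTION plv8_add(a int, b int) RETURNS int AS $$
  return a + b;   // the function body is plain JavaScript
$$ LANGUAGE plv8;

SELECT plv8_add(1, 2);   -- should return 3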
One of our clients, for various historical reasons, runs both MySQL and PostgreSQL to support their website. Information for user login lives in one database, but their customer activity lives in the other. The eventual plan is to consolidate these databases, but thus far, other concerns have been more pressing. So when they needed a report combining user account information and customer activity, the involvement of two separate databases became a significant complicating factor.
In similar situations in the past, using earlier versions of PostgreSQL, we've written scripts to pull data from MySQL and dump it into PostgreSQL. This works well enough, but we've updated PostgreSQL fairly recently, and can use the SQL/MED features added in version 9.1. SQL/MED ("MED" stands for "Management of External Data") is a decade-old standard designed to allow databases to make external data sources, such as text files, web services, and even other databases look like normal database tables, and access them with the usual SQL commands. PostgreSQL has supported some of the SQL/MED standard since version 9.1, with a feature called Foreign Data Wrappers, and among other things, it means we can now access MySQL through PostgreSQL seamlessly.
The first step is to install the right software, called mysql_fdw. It comes to us via Dave Page, PostgreSQL core team member and contributor to many projects. It's worth noting Dave's warning that he considers this experimental code. For our purposes it works fine, but as will be seen in this post, we didn't push it too hard. We opted to download the source and build it, but installing using pgxn works as well:
$ env USE_PGXS=1 pgxnclient install mysql_fdw
INFO: best version: mysql_fdw 1.0.1
INFO: saving /tmp/tmpjrznTj/mysql_fdw-1.0.1.zip
INFO: unpacking: /tmp/tmpjrznTj/mysql_fdw-1.0.1.zip
INFO: building extension
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wformat-security -fno-strict-al
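Once the extension is built and installed, wiring it up looks roughly like the following. Treat this as a sketch: the server, mapping, and table names are invented, and the option names (address, port, username, password, database, table) are from memory of the 1.0-era wrapper, so check the mysql_fdw README for your version:

CREATE EXTENSION mysql_fdw;

CREATE SERVER mysql_server
  FOREIGN DATA WRAPPER mysql_fdw
  OPTIONS (address '127.0.0.1', port '3306');

CREATE USER MAPPING FOR postgres
  SERVER mysql_server
  OPTIONS (username 'report_user', password 'secret');

CREATE FOREIGN TABLE mysql_users (id integer, email text)
  SERVER mysql_server
  OPTIONS (database 'webapp', table 'users');

-- MySQL data can now join against local Postgres tables like anything else
SELECT u.email, count(*) AS actions
FROM mysql_users u
JOIN customer_activity a ON a.user_id = u.id
GROUP BY u.email;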
Nobody likes XML, except masochists and Microsoft consultants
who charge by the hour.
Sometimes you are forced to deal with XML anyway, such as parsing the response from external APIs.
Parsing XML is per se a nasty business to begin with, but in this particular example, the ugliness set new records.
The only “solution” I came up with is too ugly for production use, but the alternatives were even uglier, so I had no option.
I hope there is someone out there reading this who can present a proper solution to the problem.
Let’s say you have this XML:
<testxml xmlns:ns1="http://www.example.com"
         xmlns:ns2="http://www.example2.com">
  <ns1:foo>
    <ns2:bar>baz</ns2:bar>
  </ns1:foo>
</testxml>
With xpath() you extract the content of the ns1:foo element:
SELECT xpath(
    '/testxml/ns1:foo',
    '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
    ARRAY[
        ['ns1','http://www.example.com'],
        ['ns2','http://www.example2.com']
    ]
);
-- Result: <ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo>
The returned XML is not valid since it's missing the xmlns declarations, but the PostgreSQL XML data type doesn't complain, which is OK I guess; a bit of quirks mode perhaps?
Because of the missing xmlns, it's impossible to make use of the returned XML fragment. You cannot extract any subsequent sub-elements from it using XPath.
For instance, this won't work:
SELECT xpath(
    '/ns1:foo/ns2:bar/text()',
    (xpath(
        '/testxml/ns1:foo',
        '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
        ARRAY[
            ['ns1','http://www.example.com'],
            ['ns2','http://www.example2.com']
        ]
    ))[1]
);
-- Error:
-- ERROR: could not create XPath object
-- DETAIL: namespace error : Namespace prefix ns1 on foo is not defined
-- <ns1:foo>
-- ^
-- namespace error : Namespace prefix ns2 on bar is not defined
-- <ns2:ba
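One workaround in the same ugly spirit (a sketch of my own, not necessarily the author's "solution"): wrap the returned fragment in a dummy element that re-declares the namespaces, so the prefixes resolve again in the second xpath() call.

SELECT xpath(
    '/wrap/ns1:foo/ns2:bar/text()',
    ('<wrap xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com">'
     || (xpath(
            '/testxml/ns1:foo',
            '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
            ARRAY[
                ['ns1','http://www.example.com'],
                ['ns2','http://www.example2.com']
            ]
        ))[1]::text
     || '</wrap>')::xml,
    ARRAY[
        ['ns1','http://www.example.com'],
        ['ns2','http://www.example2.com']
    ]
);
-- Result: {baz}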
PostGIS 2.1.0 beta2 is out. Details on what's new in it are in the official news release: http://postgis.net/2013/05/11/postgis-2-1-0beta2. This is the first version of PostGIS to work with PostgreSQL 9.3, so if you are planning to experiment with PostgreSQL 9.3, coming out soon, use this one. Also check out the documentation in the new ePub format if you have an ereader, and let us know how it looks. It seems to vary a lot depending on which ePub reader is used.
For Windows users, we've got binary builds available compiled against PostgreSQL 9.3beta1 (and also available for 9.2 x32/x64, and for 9.0 and 9.1 x64). Details are on the Windows PostGIS downloads page: http://postgis.net/windows_downloads. It does not yet include the new advanced 3D offering (provided by SFCGAL, https://github.com/Oslandia/SFCGAL), but we hope to have that compiled and packaged with the binaries before release time.
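Once the binaries are in place, a quick sanity check from psql (a minimal sketch):

CREATE EXTENSION postgis;
SELECT postgis_full_version();   -- reports the PostGIS, GEOS and PROJ versions the build was linked against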
We don't really want the expenses grouped by the person's properties. We just put that in because the parser complains if we don't. And if people turns out to be a view which joins a couple of tables, we probably can't leave it out either. This can increase the amount of sorting that the GROUP BY requires, which can sometimes have dramatic effects on performance. But even worse, there are cases where this can actually cause the query to be unrunnable. One such case is if properties is a JSON column.

SELECT a.id as a_id, a.properties, sum(b.amount) as expenses
FROM people a
JOIN expenses b on a.id = b.person_id
GROUP BY a.id, a.properties
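For concreteness, here is roughly what that failure looks like, sketched with made-up minimal tables (the error text is from memory and may vary slightly by version):

CREATE TABLE people (id int, properties json);
CREATE TABLE expenses (person_id int, amount numeric);

SELECT a.id as a_id, a.properties, sum(b.amount) as expenses
FROM people a
JOIN expenses b on a.id = b.person_id
GROUP BY a.id, a.properties;
-- ERROR:  could not identify an equality operator for type json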
The workaround below might look a bit silly. We're adding in an extra join to people that we shouldn't need. But it turns out in my experience that this actually often works pretty well, and what you pay by way of the extra join is often paid for by the fact that you're simplifying the GROUP BY, and that it is processing smaller rows, uncluttered by the extra columns:

WITH exp as (
    SELECT a.id as a_id, sum(b.amount) as expenses
    FROM people a
    JOIN expenses b on a.id = b.person_id
    GROUP BY a.id
)
SELECT exp.*, p.properties
FROM exp
JOIN people p ON p.id = exp.a_id;
Let’s face it, the pg_hba.conf file is a pain in the ass to use regularly. Sure, reloading the database will cause it to re-read this file, but with a lot of active users and frequent changes, this isn’t really tenable.
Luckily lurking deep within its bowels, PostgreSQL has a little-known feature that can easily be overlooked because it’s so humbly stated. Here’s the manual entry for pg_hba.conf for the user section:
Specifies which database user name(s) this record matches. The value all specifies that it matches all users. Otherwise, this is either the name of a specific database user, or a group name preceded by +. (Recall that there is no real distinction between users and groups in PostgreSQL; a + mark really means “match any of the roles that are directly or indirectly members of this role”, while a name without a + mark matches only that specific role.) Multiple user names can be supplied by separating them with commas. A separate file containing user names can be specified by preceding the file name with @.
The implications of this are staggering and should be shouted from the rooftops frequently and with much fanfare. But what part of that paragraph is the feature that has me raving about its awesomeness? The + decorator for a specified role.
Initially, it might occur to a DBA to simply take advantage of this ability to use existing roles and segregate access by implementing a few well-placed group lines into the file. Say we wanted to allow all DB developers to connect, and our local subnet had a range for desktop systems. We could do this:
# TYPE  DATABASE  USER        CIDR-ADDRESS    METHOD
host    all       +developer  10.10.0.0/16    md5
And voilà! Instead of granting access to each individual person, anyone in the developer group could connect provided they had a password. Neat, eh?
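Of course, the +developer entry only matches anyone if the group role exists and has members; setting that up is plain SQL (role and user names here are just examples):

CREATE ROLE developer NOLOGIN;
CREATE ROLE alice LOGIN PASSWORD 'secret';
GRANT developer TO alice;
-- alice now matches the "+developer" line in pg_hba.conf above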
Ah, but it goes much deeper than that.
What happens when we apply this to the entire file and purge all individual user entries entirely? Even for automated or batch systems? We get the
I found myself with a little unexpected time at work recently, and since we use Oracle (for a few more months), I decided to port Sqitch. Last night, I released v0.970 with full support for Oracle. I did the development against an 11.2 VirtualBox VM, though I think it should work on 10g, as well.