PostgreSQL
The world's most advanced open source database
Posted by Peter Eisentraut on 2009-07-02 at 12:39:16

If you have downloaded PostgreSQL 8.4.0 and are wondering where so many of the translations have gone: the translation team has decided to stop shipping translations that are less than about 80% complete. (See the commit message for the list of victims.) This way, incidental users of stale translations are no longer presented with a confusing and distracting mix of translated and untranslated messages. So right now we are shipping only a full or almost full set of translations into German, Spanish, French, Portuguese, and Turkish.

To get the translations into other languages back into the release, go to http://babel.postgresql.org/ and start submitting updates. Updates may be included as early as release 8.4.1 in a few months.

I hope in particular that we might get the Chinese, Italian, and Russian translations back into shape.

By the way, if you want to start (or continue) translating, I suggest that you approximately follow this priority order: libpq, psql, pgscripts, pg_dump, initdb, postgres. This or a similar order will make the translations useful to the most users with the least amount of work.
Meet the Meeting Ticker.
Posted by Pavel Golub on 2009-07-02 at 09:26:56

Preamble

Oh, it took quite a long time for PostgreSQL 8.4 to finally get up on its feet and stand firmly: 16 months. After a sixteen-month pregnancy, the development team gave birth to a pretty elephant calf. Well done, guys!

On the very same day, the development team of PostgresDAC – the newborn calf’s friend :) – decided to release PostgresDAC 2.5.2 Beta with support for 8.4 server features. And those are not just advertising words.

Crux of the matter

We’ve prepared v2.5.2 Beta with a lot of improvements. It has passed our internal tests, but it is still a beta version.

The main changes directly related to the PostgreSQL 8.4 release are:

  • 8.4 SSL Authentication is implemented
  • TPSQLRestore now supports 8.4 parallel restore
  • 8.4 client libraries are used, including the dump/restore libs
  • TPSQLRestore and TPSQLDump 8.4 support includes (see Release Notes for details, section E.1.3.8.3):
    • roNoTablespace and doNoTablespace options added
    • doIgnoreVersion & roIgnoreVersion options marked as deprecated
    • TPSQLDump.LockWaitTimeout property added
    • Role properties added

Among the other features, these certainly deserve a mention:

  • TPSQLDataset.Options property added for fine tuning of component behavior
  • TPSQLDatabase.DesignOptions property added for absolute control over stored properties
  • Extended SQL editor for TPSQLQuery added with tables and fields list
  • Special TPSQLGuidField class added

Conclusion

It is worth noting that only two bugs were reported against this release, and both were fixed. Only one was the developers’ fault; the other appeared due to internal changes in Delphi 2009 after Update 3/4.

May the Force be with you, postgresmen!

Posted by Robert Gravsjö on 2009-07-02 at 08:54:26
Every time I see something or hear something like this I sigh a little bit. Not only when it's related to SQL but in the world of computer professionals in general. "The right tool for the job" seems to be a hard concept to understand sometimes. I wonder ...

Regular readers will know that I've been thinking a lot about testing SQL result sets, how to name result testing functions, and various implementation issues. I am very happy to say that I've now committed the first three such test functions to the Git repository. They've been tested on 8.4 and 8.3. Here's what I came up with.

Read More »

Posted by David Fetter on 2009-07-01 at 17:18:57
By now, you've probably seen that PostgreSQL 8.4 can produce Mandelbrot sets
like the one below, but what are Common Table Expressions really about?

Continue reading "WITH (so much drama in the CTE)"

PostgreSQL 8.4 has ANSI SQL:2003 window functions support. These are often classified under the umbrella terms of basic Analytical or Online Analytical Processing (OLAP) functions. They are used most commonly for producing cumulative sums, moving averages and generally rolling calculations that need to look at a subset of the overall dataset (a window frame of data), often relative to a particular row. For users who use SQL window constructs extensively, their absence may have been one reason in the past not to give PostgreSQL a second look. While you may not consider PostgreSQL as a replacement for existing projects because of the cost of migration, recoding and testing, this new feature is definitely a selling point when considering new projects.
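
To make "window frame" concrete, here is a minimal pure-Python sketch of the two calculations named above. It is only an analogy for what a SQL engine computes, not PostgreSQL code; the function names and sample data are made up for illustration, and which frame clauses a given database release actually accepts is exactly what a feature comparison has to check.

```python
from collections import deque


def running_sum(values):
    """Cumulative sum: each output row sees every row from the start up to itself."""
    total = 0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


def moving_average(values, preceding):
    """Average over the current row and up to `preceding` rows before it."""
    frame = deque(maxlen=preceding + 1)  # the sliding window frame
    out = []
    for v in values:
        frame.append(v)  # the oldest row drops out automatically when full
        out.append(sum(frame) / len(frame))
    return out


sales = [10, 20, 30, 40]  # hypothetical, already ordered rows
print(running_sum(sales))
print(moving_average(sales, 2))
```

The point of the analogy is that every input row still produces one output row; the frame only changes which neighbouring rows feed the aggregate.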

If you rely heavily on windowing functions, the things you probably want to know most about the new PostgreSQL 8.4 offering are:

  • What SQL window functionality is supported?
  • How does the PostgreSQL 8.4 offering compare to that of the database you are currently using?
  • Is the subset of functionality you use supported?

To make this an easier exercise, we have culled through the documentation of the other database vendors to distill the SQL windowing functionality they provide in their core products. If you find any mistakes or ambiguities below, please don't hesitate to let us know and we will gladly amend them.

For those who are not sure what this is and what all the big fuss is about, please read our rich commentary on the topic of window functions.


Continue reading "Window Functions Comparison Between PostgreSQL 8.4, SQL Server 2008, Oracle, IBM DB2"
Posted by Andreas Scherbaum on 2009-07-01 at 17:00:00
Author
Andreas 'ads' Scherbaum

Up to PostgreSQL 8.3 it was only possible to grant (and revoke) permissions on the entire table. If column level permissions were needed, a workaround like a view solved (more or less) the problem: create the view with the required (allowed) columns, revoke all permissions from the underlying table, and grant permissions on the view.


This - of course - is inelegant, error prone and does not scale well. For different users requiring access to different columns, a large number of views is needed.


PostgreSQL 8.4 solves the problem with a shiny new feature: column level permissions.




Continue reading "PostgreSQL 8.4: Column Permissions"
Posted by US PostgreSQL Association on 2009-07-01 at 16:35:50

JD wrote:

For those sleeping in: PostgreSQL.org just released PostgreSQL 8.4. This is an exciting release with many new features including:

  • Parallel Database Restore, speeding up recovery from backup by up to 8 times
  • Per-Column Permissions, allowing more granular control of sensitive data
  • Per-database Collation Support, making PostgreSQL more useful in multi-lingual environments
  • In-place Upgrades through pg_migrator (beta), enabling upgrades from 8.3 to 8.4 without extensive downtime

    read more

Posted by Robert Gravsjö on 2009-07-01 at 16:30:46
Spread the word, PostgreSQL 8.4 is out!
Posted by Josh Berkus on 2009-07-01 at 12:52:42
Now that PostgreSQL 8.4 is out, I thought I'd write a little about my favorite 8.4 feature. As Mister Performance Whack-a-Mole, what makes me happy about 8.4 is the ability to whack moles faster ... which is why I'm very fond of pg_stat_statements.

We have covered this briefly before, but it's an important enough concept to cover again in more detail.

Problem: You are running out of disk space on the drive you keep PostgreSQL data on
Solution:

Create a new tablespace on a separate drive and move existing tables to it, or create a new tablespace and use it for future tables.

What is a tablespace and how to create a tablespace

A tablespace in PostgreSQL is similar to a tablespace in Oracle and a filegroup in SQL Server. It segments a piece of physical disk space for use by the PostgreSQL process for holding data. Below are the steps for creating a new tablespace. Tablespaces have existed since PostgreSQL 8.0.

More about tablespaces in PostgreSQL is outlined in the manual: PostgreSQL 8.3 tablespaces.

While it is possible to create a table index on a different tablespace from the table, we won't be covering that.


Continue reading "Managing disk space using table spaces"
Posted by Andrew Dunstan on 2009-06-28 at 12:24:52
I try to complete at least one significant feature item per PostgreSQL release. This time the feature is making pg_restore run in parallel. This is quite important for many users, particularly some large enterprise users.

It's important that people understand what this will do and what it won't do. pg_restore runs a number of steps. In conventional mode it simply runs them all in a single connection to the database, one after the other. In parallel mode it first runs all the quick and easy steps, essentially those that don't involve any data access, such as table and function creation, in a single connection, just like conventional mode. Then it runs the remaining steps each in its own connection. The steps are the same, and there is no parallelism within a given step. For example, a single COPY to a table is not parallelised. Rather, we run it in parallel with other data intensive steps.

The maximum amount of parallelism is controlled by the user. This will involve some experimentation to get to the sweet spot for your setup. A good place to start is the number of physical processors you have available. The idea here is to improve the situation where the CPU is the limiting factor, and allow you to drive the restoration rate up to where IO is in fact the limiting factor. With very high end hardware we believe that you can drive the parallelism quite high.
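
As a starting point for that experimentation, a small sketch like the following builds a pg_restore command line with the job count defaulting to the processor count. The helper name, the dump file name and the database name are hypothetical; `-j` (number of jobs) and `-d` (target database) are the pg_restore options in 8.4, and parallel restore expects a custom-format archive. Note that `os.cpu_count()` reports logical processors, which may be more than the physical ones suggested above.

```python
import os


def pg_restore_command(dump_path, dbname, jobs=None):
    """Build an argv list for a parallel pg_restore run (hypothetical helper)."""
    if jobs is None:
        # Logical CPU count as a first guess; tune from here by measuring.
        jobs = os.cpu_count() or 1
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return ["pg_restore", "-j", str(jobs), "-d", dbname, dump_path]


# Example: try four jobs against a custom-format dump (names are made up).
print(" ".join(pg_restore_command("mydb.dump", "mydb", jobs=4)))
```

Timing a few runs with different job counts on the real hardware is the only reliable way to find where IO takes over as the limit.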

Like many performance features, this one might well require several releases to tweak it for optimal performance gain. The program works by keeping a pool of slots to be used for the steps that are run in parallel. One possible area for improvement is in the algorithm that selects the item to be used for a slot as it becomes available. Currently we keep a queue of items that have no remaining unrestored dependencies. An item gets put on the queue as soon as all the items it depends on have been restored. This is likely to be a fairly good approximation of an optimal algorithm, but there might well be a way of tweaking it. Another possible area of optimisation would be to take some notice of the tablespace that each item affects, and try to balance these, so we use as many IO channels as possible.
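
The ready-queue idea described above can be sketched in a few lines of Python. This is a simplified simulation, not pg_restore's code: it hands out work in lockstep rounds of at most `slots` items, whereas the real program refills a slot as soon as it frees up. The item names and the `restore_order` function are invented for illustration.

```python
from collections import deque


def restore_order(deps, slots):
    """Return rounds of items; each round could run concurrently.

    deps maps every item to the set of items it depends on.
    """
    remaining = {item: set(d) for item, d in deps.items()}
    dependents = {item: [] for item in deps}
    for item, d in deps.items():
        for dep in d:
            if dep not in dependents:
                raise ValueError("unknown dependency: %s" % dep)
            dependents[dep].append(item)

    # Items with no unrestored dependencies are ready immediately.
    ready = deque(sorted(i for i, d in remaining.items() if not d))
    rounds = []
    done = 0
    while ready:
        batch = [ready.popleft() for _ in range(min(slots, len(ready)))]
        rounds.append(batch)
        for item in batch:
            done += 1
            for child in dependents[item]:
                remaining[child].discard(item)
                if not remaining[child]:
                    ready.append(child)  # queued once its last dependency is restored
    if done != len(deps):
        raise ValueError("dependency cycle detected")
    return rounds


deps = {
    "table_a_data": set(),
    "table_b_data": set(),
    "index_a": {"table_a_data"},
    "fk_b_to_a": {"table_a_data", "table_b_data"},
}
print(restore_order(deps, slots=2))
```

A smarter selection policy, such as the tablespace balancing mentioned above, would only change which ready item is picked next; the dependency bookkeeping stays the same.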

What is important is that we have now got the basic framework of parallel restore, so that some researchers can easily experiment with various tweaks to improve the performance.

pg_restore is going to be with us for quite a long time. Even if we manage to get pg_upgrade working pretty well, that will take quite a bit of time, and there is currently no guarantee that it will work for every release. So I expect pg_restore to be the most common method of upgrading for quite some time; making it run as fast as possible is thus still a significant requirement.

I'm proud to have been able to contribute this feature to Postgres, and look forward to other people improving it further as time goes by.
8.4.0 will be out soon. Meanwhile, keep testing! http://www.postgresql.org/developer/beta
Posted by Josh Berkus on 2009-06-27 at 17:31:36
I run a lot of lightning talks, and one tool I haven't been able to find a satisfactory solution for is the timer for the talks ...
Author
Andreas 'ads' Scherbaum

The PostgreSQL Project will have a dev-room at FrOSCon on Sunday, August 23, 2009. Talks wanted!


The theme should be PostgreSQL-related, please submit the talk(s) by using the FrOSCon Pentabarf:


https://pentabarf.froscon.org/submission/froscon2009/


Procedure:



  1. Create account (if not yet done) and follow the confirmation link in the email

  2. Log in to Pentabarf

  3. Create a new event

  4. Choose track "PostgreSQL"


All submitters will receive a timely confirmation if their talk is accepted. Anyone who wants to submit a talk about databases in general, or about another database, may choose the "OpenSQL Camp" track.

Posted by Theo Schlossnagle in OmniTI on 2009-06-26 at 03:33:33

In perhaps a new trend, I’m blogging from 39011 feet (or so says the seatback in front of me). I’m traveling back home to the east coast from San Jose, CA, where I attended (and spoke at) this year’s O’Reilly Velocity Conference.

I took part in (and blogged about) the Velocity Summit, as I have for the past two years. The summit is the unconference preceding the real conference; it helps the organizers digest current hot topics and better define the tracks for the actual conference. The summit itself is filled with enough brain power to warp space-time, so I drop everything to go to that.

Ironically, despite being a well respected authority in web site (and general internet) scalability and performance, my talk proposals for Velocity 2008 were not accepted — I clearly need to write better proposals. This year, I managed to work my way into the workshop track on Monday. Despite having a bad headache and feeling "off" the day before, I managed to get my act together and put on an A-game for my workshop. For those of you interested, here is my scalable09 slide stack.

I thought I’d take a moment to talk about what I liked about the conference and what I think could use some improvement. I realize this is a down economy and that might be a legitimate justification for some of the actions that resulted in some of my disappointment.

First, the negative. I usually start with positive and end with negative because I’m a pessimist. However, all in all the conference was awesome, so I thought I’d get my short list of gripes out of the way early.

O’Reilly is famous for throwing good conferences for geeks. In my opinion, the field of web operations has been so severely neglected and applies so broadly to the world today that this conference needs to be for everyone.

  1. In the next conference, I’d love to see a technical business track. Several of the talks I went to spoke to the dollars and cents lost or earned by paying the right amount of attention to web site performance and better operational paradigms. I thought a lot of the topics would be very useful to business managers.
  2. The first day was not video taped and the second and third day were only half video taped. Come on guys, ante up. The attendance fee was substantial, you can afford to give your attendees the value of watching what they had to choose not to attend. I like the option when I go to a conference to choose a session that seems interesting so that I have the opportunity to participate, but often times I find that another session was top notch and I expect to be able to later review a recording of that.
  3. Lastly, and most significantly: while I thought the conference was extremely well executed (excellent job Jesse, Steve, all your support, and most definitely O’Reilly), it lacked sufficient PR and marketing outreach. I talked with several journalists (as a part of my normal day job) while I was at that conference and not one of them was aware of Velocity: simply embarrassing. Given that the Structure conference was in town that same week, O’Reilly should have invested more in their PR and marketing outreach. It would have resulted in a substantially increased audience and a better venue for teaching the world to run a faster web.

Now that I’ve griped and aired my disappointment, I can focus on the gobs of awesomeness that was Velocity.

  1. The conference was put on quite well from an operational perspective. Things started on time, A/V problems were non-existent. Like an idiot, I managed to lose my MacBook Air power adapter and the A/V crew managed to recover it for me. Conferences just plain suck when they have technical difficulties; this one had none.
  2. The two tracks at the conference were extremely well articulated and while I wanted to be in both all the time (as OmniTI is a full-stack company, we care about both equally) it was an excellent split.
  3. One track was performance which focused intently on user-perceived performance. This was largely front-end (HTML,CSS,JS,etc.) but also had a healthy amount of deep stack performance dis

[continue reading]

Posted by Kenny Gorman on 2009-06-26 at 00:28:04

At Hi5, we currently use pg_reorg 1.0.3 in order to organize data in a clustered fashion. I posted previously about the strategy. Our version is slightly modified: the changes I made to the C code essentially allow pg_reorg to spin/wait for locks on the objects to be released before proceeding.

The good news is the folks at NTT have incorporated a similar change in pg_reorg 1.0.4. This is a fantastic improvement, and frankly implemented in a cleaner way than my changes.

The crux of the issue is that when a database is being auto-vacuumed, you can’t be guaranteed that pg_reorg and the vacuum will not collide. In theory you should not need to vacuum a table which you are pg_reorg’ing, because that is the point of a pg_reorg: it’s essentially a vacuum full with extra features, because the table is being rebuilt from scratch. However, in an environment where auto-vacuum is being utilized to keep tables vacuumed, both will need to co-exist.

The change is simple: use the NOWAIT option of LOCK TABLE to fail if the lock cannot be obtained. This is wrapped in a loop that retries until the lock is granted. The effect is that pg_reorg patiently sits and waits while your vacuums complete, and then it can finish its work. The downside is that if any of these operations run for too long, the journal table may grow very large. So there should be some monitoring wrapped around the code if it’s intended to run in the background. For the future we need a backoff algorithm as well as perhaps a limit on the number of spin/sleep cycles, but hey, this is excellent progress.

This tool is essential in my humble opinion for everyone running PostgreSQL in a high transaction/high availability environment. By the way, pg_reorg works seamlessly with Slony-I.

The code addition does the following:

for (;;)
{
        command("BEGIN ISOLATION LEVEL READ COMMITTED", 0, NULL);
        res = execute_nothrow(table->lock_table, 0, NULL);
        if (PQresultStatus(res) == PGRES_COMMAND_OK)
        {
                PQclear(res);
                break;
        }
        else if (sqlstate_equals(res, SQLSTATE_LOCK_NOT_AVAILABLE))
        {
                /* retry if lock conflicted */
                PQclear(res);
                command("ROLLBACK", 0, NULL);
                sleep(1);
                continue;
        }
        else
        {
                /* exit otherwise */
                printf("%s", PQerrorMessage(connection));
                PQclear(res);
                exit(1);
        }
}
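
The backoff and retry limit wished for above are not part of this code. As a rough illustration of the idea only, here is a Python sketch of the same retry loop with exponential backoff and a bounded number of attempts. Everything in it is hypothetical: `LockNotAvailable` stands in for the lock_not_available error (SQLSTATE 55P03), `try_lock` stands in for running the LOCK TABLE ... NOWAIT statement, and the delay parameters are arbitrary.

```python
import time


class LockNotAvailable(Exception):
    """Stand-in for SQLSTATE 55P03 (lock_not_available)."""


def backoff_delays(max_attempts, base=1.0, factor=2.0, cap=30.0):
    """Yield the sleep before each retry: base, base*factor, ..., capped at cap."""
    delay = base
    for _ in range(max_attempts - 1):
        yield min(delay, cap)
        delay *= factor


def acquire_with_backoff(try_lock, max_attempts=5, sleep=time.sleep, **backoff):
    """Call try_lock() until it succeeds or max_attempts is exhausted."""
    delays = backoff_delays(max_attempts, **backoff)
    for attempt in range(1, max_attempts + 1):
        try:
            return try_lock()
        except LockNotAvailable:
            if attempt == max_attempts:
                raise  # give up instead of spinning forever
            sleep(next(delays))


# Demo with a fake lock that fails twice, recording sleeps instead of waiting.
state = {"calls": 0}

def fake_lock():
    state["calls"] += 1
    if state["calls"] < 3:
        raise LockNotAvailable()
    return "locked"

slept = []
print(acquire_with_backoff(fake_lock, sleep=slept.append), slept)
```

Giving up after a fixed number of attempts also bounds how long the journal table can keep growing while the tool waits.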

The below is a snip of the strace on pg_reorg while it’s waiting for the lock:

rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
sendto(3, "P\0\0\0008\0SELECT reorg.reorg_apply($"..., 529, 0, NULL, 0) = 529
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recvfrom(3, "1\0\0\0\0042\0\0\0\4T\0\0\0$\0\1reorg_apply\0\0\0\0"..., 16384, 0, NULL, NULL) = 77
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
sendto(3, "P\0\0\0\177\0SELECT 1 FROM pg_locks WHE"..., 178, 0, NULL, 0) = 178
rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recvfrom(3, "1\0\0\0\0042\0\0\0\4T\0\0\0!\0\1?column?\0\0\0\0\0\0\0"..., 16384, 0, NULL, NULL) = 74
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})

Your PostgreSQL log file will show the following:

Jun 25 17:09:33 <dbname> postgres[7825]: [37-2] 2009-06-25 17:09:33 PDTSTATEMENT:  LOCK TABLE <tablename> IN ACCESS EXCLUSIVE MODE NOWAIT
Jun 25 17:09:34 <dbname> postgres[7825]: [38-1] 2009-06-25 17:09:34 PDTERROR:  could not obtain lock on relation "<tablename>"
Whenever I encounter a pervasive query performance problem, one of the first things I ask is, "is this database really normalized?" Usually the answer is no ...
Posted by Hubert Lubaczewski on 2009-06-25 at 13:04:44
I just updated explain.depesz.com with a bugfix that changes the way Bitmap Index Scan and Bitmap Heap Scan are displayed. Apparently index and table names were not shown previously. Thanks go to Viktor Rosenfeld for spotting and reporting the bug.
PostgreSQL Conference West 2009 Call for Papers

June 24th, 2009: the PostgreSQL Conference U.S. team is pleased to announce the West 2009 venue and call for papers. This year the premier West Coast PostgreSQL Conference will be leaving its roots at Portland State University and moving north to sunny Seattle, Washington. The event this year is being held at Seattle Central Community College from October 16th through 18th. The move to Seattle opens up a larger metropolitan area for continuing to expose database users, developers, and administrators to the World's Most Advanced Open Source Database.

Following previously successful West Coast conferences, we will be hosting a series of 3-4 hour tutorials, 90 minute mini-tutorials, and 45 minute talks. This year we will be continuing our trend of covering the entire PostgreSQL ecosystem. We would like to see talks and tutorials on the following topics:

General PostgreSQL:
  • Administration
  • Performance
  • High Availability
  • Migration
  • GIS
  • Integration
  • Solutions and White Papers
The Stack:
  • Python/Django/Pylons/TurboGears/Custom
  • Perl5/Catalyst/Bricolage
  • Potato
  • Ruby/Rails
  • Java (PLJava would be great)/Groovy/Grails
  • Operating System optimization (Linux/FBSD/Solaris/Windows)
  • Solutions and White Papers
If you are using PostgreSQL as your platform, you need to be presenting at this conference!
Submit your talk (you must have an account on the site)
*** The PostgreSQL Conference U.S. series is an autonomous Educational Project used to educate all comers on the use of The World's Most Advanced Open Source Database. Proceeds from the event are donated directly to United States PostgreSQL, the 501c3 non-profit for PostgreSQL education and advocacy in the United States.
Posted by Josh Berkus on 2009-06-24 at 18:13:32
In addition to the pgDay San Jose, there are several talks at OSCON which will be of interest to PostgreSQL community ...
Posted by US PostgreSQL Association on 2009-06-23 at 14:22:55

Michael Brewer wrote:

On Saturday, June 13th, I wound up manning the PostgreSQL booth at SouthEast LinuxFest, in Clemson, South Carolina. This free conference drew a larger crowd than I'd expected; organizers told me there had been some 450 registrants by the day before, and they were expecting a final total of over 500 (with walk-ups).

read more

Let’s say you imported some data, but it contains duplicates. You will have to handle them in some way, but to make a sensible choice about how to handle them, you need more information. So, let’s start. We have a table: # \d users [...]
Posted by Dimitri Fontaine on 2009-06-23 at 08:53:00

At long last, after millions and millions of queries just here at work and some more in other places, the prefix project is reaching the 1.0 milestone. The release candidate is being uploaded to Debian at the moment of this writing, and is available at the following place: prefix-1.0~rc1.tar.gz.

If you have any use for it (as some VoIP companies have already), please consider testing it, in order for me to release a shiny 1.0 next week! :)

Recent changes include getting rid of the square brackets in the output when they are not necessary, fixing btree operators, and adding support for more operators in the GiST support code (now supported: @>, <@, =, &&). Enjoy!

Posted by David Wheeler on 2009-06-22 at 17:34:00

I've been thinking more about testing SQL result sets and how to name functions that do such testing, and I've started to come to some conclusions. Some of the constraints I'm looking at:

Read More »

Posted by Baron Schwartz on 2009-06-21 at 13:48:49

Last weekend, my brother and I attended SELF 2009. A few thoughts on it:

The mixture of sessions was interesting. There were some really good ones. I think the best session I attended was an OpenSolaris/NetBeans/Glassfish/Virtualbox/ZFS session, given by a Sun employee. He was an excellent presenter, and really showed off the strengths of the technologies in a nice way. He started up enough VMs to make his OpenSolaris laptop chew into swap, and I thought it was fun to see how it dealt with that. I’ve heard Solaris and OpenSolaris do a lot better at avoiding and managing swapping than GNU/Linux, but I couldn’t form any opinion from watching. I did think it was odd to have this session at a “Linux” (yes, they left off the GNU) conference. But I thought the session was a good addition to the conference. In other sessions, and in the hallways and expo, there was a lot more slant towards open-source software and gadgetry in general than there was towards GNU/Linux. The sessions that were about Linux or GNU/Linux were top-heavy towards topics like educational initiatives.

The Free Software Foundation had a booth in the expo hall. It was funny that they didn’t boycott the event, because I know RMS won’t speak at so-called “Linux User Groups” and insists they be called “GNU/Linux User Groups.” I guess the FSF is not unified behind that banner. Regardless, I used the opportunity to renew my membership perpetually. I’m so lazy that I need something like this to stay involved!

The expo hall was dominated by Red Hat, Fedora, and SUSE; PostgreSQL was there, but not MySQL. There was a good variety and number of vendors. It was great to see the healthy support of the event, which was free, by the way.

Clemson, SC is not easy to get to, and while the Clemson campus was attractive and functioned fine, it’s nothing you can’t find elsewhere. I ended up driving over 9 hours to get to it. I’d have preferred the technology triangle, which if nothing else is close to major airports, bus and train stops, and Red Hat.

Richard Hipp talked about the great fsync() bug, a similar talk to the one he gave at the first OpenSQL Camp. Someone asked about Tokyo Cabinet and he responded that he hasn’t found any fsync() calls in its source code. *cough* Something worth thinking about for on-disk usage (I haven’t looked at its source much myself). TC can also be used in-memory-only, and a while back I suggested that usage of it for Drizzle to replace the Memory engine; I don’t know what became of that.

The Call for Papers for pgday.eu is out. Submit! http://2009.pgday.eu/
Posted by Robert Hodges on 2009-06-20 at 17:27:34
They sometimes go bad in completely unpredictable ways. Here's a problem I have now seen twice in production situations. A host boots up nicely and mounts file systems from the SAN. At some point a SAN switch (e.g., through a Fibre Channel controller) fails in such a way that the SAN goes away but the file system still appears visible to applications.

This kind of problem is an example of a Byzantine fault where a system does not fail cleanly but instead starts to behave in a completely arbitrary manner. It seems that you can get into a state where the in-memory representation of the file system inodes is intact but the underlying storage is non-responsive. The non-responsive file system in turn can make operating system processes go a little crazy. They continue to operate but show bizarre failures or hang. The result is problems that may not be diagnosed or even detected for hours.

What to do about this type of failure? Here are some ideas.
  1. Be careful what you put on the SAN. Log files and other local data should not go onto the SAN. Use local files with syslog instead. Think about it: your application is sick and trying to write a log message to tell you about it on a non-responsive file system. In fact, if you have a robust scale-out architecture, don't use a SAN at all. Use database replication and/or DRBD instead to protect your data.
  2. Test the SAN configuration carefully, especially failover scenarios. What happens when the host fails over from one access path to another? What happens when another host picks up the LUN from a "failed" host? Do you have fencing properly enabled?
  3. Actively look for SAN failures. Write test files to each mounted file system and read them back as part of your regular monitoring. That way you know that the file system is fully "live."
The last idea gets at a core issue with SAN failures--they are rare, so it's not the first thing people think of when there is a problem. The first time this happened on one of my systems it was around 4 a.m. It took a really long time to figure out what was going on. We didn't exactly feel like geniuses when we finally checked the file system.
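
A write-and-read-back check of the kind suggested in the third idea might look like the sketch below. It is a minimal illustration under stated assumptions, not a complete monitor: the function name and probe file naming are made up, the read may be served from the operating system's cache rather than the storage itself, and on a truly hung file system the calls can block, so a real check would run this in a separate process with a timeout.

```python
import os
import uuid


def probe_filesystem(mount_point):
    """Write a unique token under mount_point, sync it, and read it back.

    Returns True if the round trip succeeds, False on any I/O error.
    """
    token = uuid.uuid4().hex
    path = os.path.join(mount_point, ".san_probe_" + token)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            os.write(fd, token.encode("ascii"))
            os.fsync(fd)  # ask the OS to push the write to the device
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            return f.read() == token.encode("ascii")
    except OSError:
        return False
    finally:
        try:
            os.remove(path)  # best-effort cleanup of the probe file
        except OSError:
            pass
```

Calling `probe_filesystem` on each mounted path from the regular monitoring job, and alerting on either a False result or a timeout, turns a silent failure into a visible one.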

SANs are great technology, but there is an increasingly large "literature" of SAN failures on the net, such as this overview from Arjen Lentz and this example of a typical failure. You need to design mission-critical systems with SAN failures in mind. Otherwise you may want to consider avoiding SAN use entirely.
Posted by Scott Bailey on 2009-06-19 at 21:15:23