| This is one of the things about being a web developer and part-time DBA that keeps me up at night (sometimes literally all night). About a month ago the source file table on Coveralls.io[0] hit transaction ID wraparound, which put us in read-only mode for over 24 hours while RDS engineers performed a full vacuum (not something a customer can do manually). On a managed database I'm paying thousands a month for, I was hoping there would be some sort of early warning system. Well, apparently there is, but it's buried in the logs and won't trigger any app exceptions, so it went unnoticed. What's worse, there's zero indication of how long a vacuum is going to take, and no progress updates while it's running. So for a production web app with customers, this means damage-control language like: "Our engineers have identified a database performance issue and are working to mitigate it. Unfortunately we do not have an ETA at this time." About a week later, more calamity hit: the INT "id" column on the same table exceeded its maximum value. My first thought was to change it to a BIGINT, but ~4 hrs into the migration, with no indication of how much longer it would take, I pulled the plug and sharded the table instead. The moral of the story is that web devs should be aware of these pitfalls, and that no matter how much trust you put in a managed database service, it could still happen to you (cue ominous background music). Anyway, I'm glad to see this lurking monster in our beloved database tamed. Thank you, Mr Haas! [0] https://coveralls.io |
The upcoming 9.6 release will help with the lack of vacuum progress reporting to a certain degree: http://www.postgresql.org/docs/devel/static/progress-reporti...
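For the early warning the parent comment was hoping for, here is a rough sketch of a check you could run from a cron job or monitoring agent. It assumes you fetch `age(datfrozenxid)` yourself with whatever PostgreSQL client you already use (the query string is included for reference). The function names and the 50% / 80% thresholds are made up for illustration and are not PostgreSQL defaults. The same idea covers the INT "id" problem: compare `max(id)` against the 32-bit signed limit.

```python
# Sketch only: helper names and thresholds are illustrative assumptions,
# not anything PostgreSQL or RDS ships.

# XIDs are 32-bit and compared modulo 2**32, so trouble starts as the
# oldest unfrozen XID approaches 2**31 transactions in the past.
XID_WRAPAROUND_LIMIT = 2**31

# Largest value a 4-byte signed INT column can hold.
INT4_MAX = 2**31 - 1

# Run this with your own client; it returns one row per database.
XID_AGE_QUERY = (
    "SELECT datname, age(datfrozenxid) AS xid_age "
    "FROM pg_database ORDER BY xid_age DESC;"
)


def xid_fraction_used(xid_age):
    """Fraction of the wraparound window already consumed (0.0 to 1.0)."""
    return xid_age / XID_WRAPAROUND_LIMIT


def classify_xid_age(xid_age, warn_at=0.5, critical_at=0.8):
    """Map an XID age to 'ok', 'warn' or 'critical'.

    warn_at and critical_at are arbitrary example thresholds.
    """
    used = xid_fraction_used(xid_age)
    if used >= critical_at:
        return "critical"
    if used >= warn_at:
        return "warn"
    return "ok"


def int4_ids_remaining(current_max_id):
    """How many more ids an INT primary key can hand out."""
    return INT4_MAX - current_max_id


if __name__ == "__main__":
    # Example values; in practice these come from the queries above.
    for age in (100_000_000, 1_200_000_000, 2_000_000_000):
        print(age, classify_xid_age(age))
    print("ids left:", int4_ids_remaining(2_000_000_000))
```

Hooking the result up to whatever pages you at 3am is the important part, since the warnings in the server log are easy to miss.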