| So this is a technical problem I am having right now that's preventing me from backing up a Postgres database completely (hope someone here can help). I have a master Postgres database that is receiving a TON of transactions per second (I'm talking about thousand concurrent transactions). We tried running pg_dump on this database, but the DB is just too huge, and it took more than 4 days to completely dump out everything. Not only that but it impacted performance to the point where backing it up was just not feasible. No problem.. just create a slave-DB and run pg_dump on that, right? We did just that, but the problem is that you can't run long running queries on a hot standby (queries that take more than a minute). What would you do in my scenario? With the hot standby, I technically am backing up my data, but I would have 100% piece of mind if I could daily backups in case someone accidentally ran a "DROP DATABASE X", which would also delete the hot standby/slave db as well. |
http://www.postgresql.org/docs/9.0/static/runtime-config-wal... See max_standby_archive_delay and max_standby_streaming_delay, -1 lets them wait forever.
Alternately, you can issue pg_start_backup('label'), backup the filesystem, then issue pg_stop_backup() and keep all the WAL logs from that time. That'll get you a base backup similar to the slave.
What I'm doing is this:
I've got a primary/hot spare pair, and a tertiary db on lesser equipment that's my second copy for cases where I have one of the main machines down or I have to rebuild the secondary from the primary.
The tertiary db ships logs to s3, after gpging them. Every $timeframe, I take a base backup and throw it up as well. I keep a couple, and delete the older ones. Every few months, I test a restore on ec2. There's a balance between the WAL logs that you need to keep, the time to restore, and the frequency of base backups.
[edit - parameter names. Further edit - strategy.]