Hacker News new | ask | show | jobs
by andruby 3225 days ago
It does continuous backup like WAL-E.

Both back up PG's WAL files (Write Ahead Log) and allow restoring your database state as it was at a specific time or after a specific transaction committed. This is known as point-in-time recovery (PITR) [0]

Users and admins make mistakes, and accidentally delete or overwrite data. With PITR you can restore in a new environment, just before the mistake occurred and recover the data from there.

[0] https://www.postgresql.org/docs/9.6/static/continuous-archiv...

1 comments

What I meant is that the archive_command is run only when a WAL segment is completed or when archive_timeout is reached. In the meantime, nothing is backed up. On a low traffic database, this can be a problem. I'm wondering if there is a way to continuously stream the WAL to an object storage like S3, without waiting to have a complete segment.
S3 is a block store; not something you can really stream to.

However it might be interesting to stream WAL logs to e.g. AWS Kinesis....

You can open multi-part transfers and close out the transfer when you're ready, which can be used so that it is very close to streaming; for this case perhaps it's close enough to try with wall-g if it otherwise supports it.
You're right. S3 is an object store and doesn't support the append operation, which is required for what I want to do. Thanks!
That's the usecase for archive_timeout. I set it to 60 seconds. So at most I'll have lost 60s + the time to transfer the file to s3, which shouldn't be more than a couple seconds.
According to PostgreSQL documentation, "archived files that are archived early due to a forced switch are still the same length as completely full files".

I'm afraid to use a lot of storage for WAL segments that are mostly empty:

16 MB per segment x 60 minutes x 24 hours x 7 days = 161 GB/week

Does WAL-G/WAL-E compression help?

Yes, the lzop compression helps a lot, and I imagine that mostly "empty files" will be more compressible.

On a staging server with little activity, the compressed WAL-E wal files go as low as 1.9MB per 10 minutes. (~2GB/week)

The production server has files between 4 and 12MB per 1 minute or less. (~220GB/week)

WAL-E has a good `wal-e delete retain` command that removes older base backups and wal files.

2 GB/week is so much better than 161 GB/week! It looks like compression helps a lot. Thanks for sharing these numbers.