by takeda 2340 days ago
If the database is PostgreSQL, I would strongly advise forgetting about filesystem snapshots and using streamed backups instead: barman if you are on premises, WAL-E or WAL-G if you are in the cloud (I have never used WAL-G, but it looks like an improvement over WAL-E).

This gives you a backup that can be replayed, so you can restore to any point in time. You can also use such a backup to set up replication. There is still a daily backup, which is there to speed up recovery and increase resiliency. Those backups don't put much load on the database, but if that's a concern you can back up from a replica instead (which is what cloud providers, or at least AWS, do).
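To make the "restore to any point in time" idea concrete, here is a toy Python sketch of the selection logic: take the newest daily backup that is not newer than the target time, then replay the WAL that covers the gap. The `BaseBackup` and `WalSegment` types and the `plan_restore` function are names I made up for illustration, and giving each segment explicit start and end timestamps is a simplifying assumption. This is not how barman, WAL-E or WAL-G are implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BaseBackup:
    """Hypothetical record of one full (daily) backup."""
    label: str
    taken_at: datetime


@dataclass(frozen=True)
class WalSegment:
    """Hypothetical record of one archived WAL segment and the time span it covers."""
    name: str
    start: datetime
    end: datetime


def plan_restore(
    backups: List[BaseBackup],
    wal: List[WalSegment],
    target: datetime,
) -> Optional[Tuple[BaseBackup, List[WalSegment]]]:
    """Return the backup to restore and the WAL segments to replay on top of it.

    Returns None when no backup is old enough to reach the target time.
    """
    # Only backups taken at or before the target can be rolled forward to it.
    usable = [b for b in backups if b.taken_at <= target]
    if not usable:
        return None

    # The newest usable backup means the least WAL to replay.
    base = max(usable, key=lambda b: b.taken_at)

    # Replay every segment that overlaps the window (base.taken_at, target].
    replay = [
        seg
        for seg in sorted(wal, key=lambda s: s.start)
        if seg.end > base.taken_at and seg.start <= target
    ]
    return base, replay
```

This also shows why the daily backup speeds up recovery: the more recent the backup, the shorter the list of segments that has to be replayed.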

As for ZFS, out of the box it is not a good file system for databases, although you can get good performance after tuning. For example, you want to configure it so that its block size is aligned with the database block size, configure the ZIL, and perhaps change the block hashing algorithm (although I think the current default should be fast).
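A rough back-of-the-envelope sketch in Python of why the block size alignment matters. It assumes PostgreSQL's default 8 kB page and ZFS's default 128K `recordsize`, and it ignores compression, caching and write coalescing, so treat the result as a worst case rather than a measurement. The function names and the dataset name `tank/pgdata` are hypothetical.

```python
import math
from typing import List

POSTGRES_PAGE_BYTES = 8 * 1024          # PostgreSQL default block size
ZFS_DEFAULT_RECORD_BYTES = 128 * 1024   # ZFS default recordsize


def write_amplification(record_bytes: int, page_bytes: int = POSTGRES_PAGE_BYTES) -> float:
    """Worst-case bytes rewritten per page update, divided by the page size.

    Assumes the page is aligned to a record boundary and that every record
    touched by the page has to be rewritten in full (copy-on-write).
    """
    if record_bytes <= 0 or page_bytes <= 0:
        raise ValueError("sizes must be positive")
    records_touched = math.ceil(page_bytes / record_bytes)
    return records_touched * record_bytes / page_bytes


def recordsize_command(dataset: str, record_bytes: int = POSTGRES_PAGE_BYTES) -> List[str]:
    """Build (but do not run) a `zfs set recordsize=...` command for a dataset."""
    return ["zfs", "set", f"recordsize={record_bytes // 1024}K", dataset]
```

Under these assumptions, `write_amplification(ZFS_DEFAULT_RECORD_BYTES)` comes out at 16, while a record size matching the page gives 1. `recordsize_command("tank/pgdata")` only builds the argument list; the property applies to blocks written after the change, so it is best set before loading data.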

As for your question about how cloud providers are doing it, most of us can only speculate. To me it looks like standard RDS instances simply sit on EBS (which in turn uses S3, at least for snapshots). In Aurora they skipped EBS and implemented their own database storage layer directly.

It seems like the backups are performed in the traditional way, though.