Hacker News
by martinrame 345 days ago
What about ZFS snapshots and send/recv for backup and restore? For us this is the cleanest approach, since we use it not only for PostgreSQL but for all the data in our organization. Of course, the underlying filesystem must be ZFS.
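For anyone unfamiliar with the workflow, here is a rough sketch of the commands involved, written as a small Python helper that only builds the command lines and does not run them. The dataset, snapshot, and host names are made up for illustration. It also assumes the Postgres data directory and WAL live on the same dataset, so that a single snapshot is crash-consistent.

```python
import shlex

def snapshot_cmd(dataset: str, snap: str) -> list[str]:
    """Command that takes an atomic snapshot of one dataset."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]

def send_recv_pipeline(dataset, snap, target, prev=None, ssh_host=None) -> str:
    """Shell pipeline that replicates a snapshot to another pool.

    With prev set, only the changes between prev and snap are sent
    (incremental send via -i). With ssh_host set, the receiving side
    runs on that host instead of locally.
    """
    send = ["zfs", "send"]
    if prev is not None:
        send += ["-i", f"{dataset}@{prev}"]
    send.append(f"{dataset}@{snap}")

    recv = ["zfs", "recv", target]
    if ssh_host is not None:
        recv = ["ssh", ssh_host] + recv

    return shlex.join(send) + " | " + shlex.join(recv)

# Hypothetical names: tank/pgdata, backup/pgdata, backuphost
print(shlex.join(snapshot_cmd("tank/pgdata", "tuesday")))
print(send_recv_pipeline("tank/pgdata", "tuesday", "backup/pgdata",
                         prev="monday", ssh_host="backuphost"))
```

Restoring is the same pipeline in the other direction: send the snapshot from the backup pool and receive it into a fresh dataset on the database host.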
3 comments

I guess it all depends on your requirements, since you would still lose whatever was written between your last snapshot and the failure. That said, I'm a huge fan of ZFS, and it might be one reason to try out Postgres on FreeBSD, since to my knowledge the only Linux distro that ships ZFS painlessly out of the box is Ubuntu.
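To put a number on that data-loss window, here is a tiny sketch. The timestamps are invented and the function name is just for illustration. The loss is the time between the most recent snapshot and the failure, so with hourly snapshots the worst case approaches one hour.

```python
from datetime import datetime, timedelta

def data_loss_window(snapshot_times, failure_time):
    """Time between the latest snapshot taken before the failure and the failure.

    Returns None if no snapshot precedes the failure (nothing to restore from).
    """
    earlier = [t for t in snapshot_times if t <= failure_time]
    if not earlier:
        return None
    return failure_time - max(earlier)

# Hypothetical schedule: hourly snapshots, failure at 01:45
snaps = [datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 1, 0)]
lost = data_loss_window(snaps, datetime(2024, 1, 1, 1, 45))
print(lost)
```

Closing that gap is what continuous WAL archiving or streaming replication is for; snapshots alone only bound the loss by the snapshot interval.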

I'm also curious how Distributed Replicated Block Device (DRBD) would perform. It would obviously add latency, but perhaps it would be an easier and more efficient solution for a "hot spare" setup than using Postgres's native functionality. To my understanding, DRBD can be configured to protect you from hardware IO errors by "detaching" from an erroring disk.

I also don't know if it's a valid point, but I've heard people say that you don't want a fancy CoW filesystem for databases, since much of what it offers is something databases already handle themselves. So you might be sacrificing performance for safety from things that "should not happen"(tm) anyway, depending on how it's set up, I guess.

I agree with your overall point. That said: ZFS on Debian is pretty painless. If the kernel module has to be built and linked, apt will do it all for you, so you don't have to do anything.

ZFS on NixOS is usually quite easy as well, even on / : https://wiki.nixos.org/wiki/ZFS

On the Xata platform we actually do CoW snapshots and branching at the block device level, which works great.

However, we are developing pgstream in order to bring in data and sync it from other Postgres providers. pgstream can also do anonymisation and, in the future, subsetting. Basically this means that no matter which Postgres service you are using (RDS, CloudSQL, etc.), you can still use Xata for staging and dev branches.

Or btrfs. I also think that filesystem snapshots are an underrated backup strategy, assuming your data fits on one disk (which should be the case for almost all applications outside of FAANG).

Why would btrfs or btrfs snapshots require a single disk? My btrfs is a combination of different-size disks bought over time (3T to 24T) and snapshots work just fine. I've configured it to use raid with 2 copies for data and 3 for metadata.
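As a rough illustration of how mixed disk sizes play out with 2 copies of data, here is a back-of-the-envelope estimate. The disk sizes are hypothetical, and the formula is a common approximation that ignores metadata overhead and chunk granularity: every block needs its second copy on a different disk, so usable space is capped both by half the total and by what the other disks can mirror for the largest one.

```python
def raid1_usable(disk_sizes):
    """Approximate usable data capacity with 2 copies on different disks.

    Capped by half the raw total, and by the combined size of all disks
    except the largest (the largest disk cannot mirror onto itself).
    """
    total = sum(disk_sizes)
    largest = max(disk_sizes)
    return min(total / 2, total - largest)

# Hypothetical pool, sizes in TB
print(raid1_usable([3, 8, 24]))   # limited by the 24 TB disk
print(raid1_usable([4, 4, 4]))    # limited by half the total
```

So a pool dominated by one very large disk wastes part of that disk until more capacity is added alongside it.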