by bgro 912 days ago
It’s always amazing to me how frequently backups silently fail. Every backup program or general-purpose copy tool I’ve seen has many points of silent failure, where it just gives up copying partway through or skips files without indicating which ones or why.

If you don’t delete files as you go, now you have an unknown partial backup state that basically doubles your needed space.

If you delete as you go, sometimes the process stops or corrupts something partway, so your data is now split across two places and you may have lost some of it.

Even trying to log all the failures during the process is amazingly difficult, and the workarounds for that specific problem, ironically, somehow introduce more and new types of silent failure themselves.
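
To make the "log every failure" idea concrete, here is a minimal sketch in Python of a copy loop that refuses to skip anything quietly. It is an illustration under assumptions, not how any particular backup tool works: `backup_tree` and `sha256_of` are hypothetical names, and it only handles regular files in a directory tree.

```python
import hashlib
import os
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, destination: Path) -> list[tuple[str, str]]:
    """Copy source into destination; return (path, reason) for every failure."""
    failures = []

    def on_walk_error(error: OSError) -> None:
        # os.walk ignores unreadable directories unless onerror is supplied.
        failures.append((str(error.filename), f"walk failed: {error}"))

    for root, _dirs, files in os.walk(source, onerror=on_walk_error):
        for name in files:
            src = Path(root) / name
            rel = src.relative_to(source)
            dst = destination / rel
            try:
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                # Re-read both sides so a truncated or corrupted copy is caught.
                if sha256_of(src) != sha256_of(dst):
                    failures.append((str(rel), "checksum mismatch after copy"))
            except OSError as error:
                failures.append((str(rel), f"copy failed: {error}"))
    return failures
```

An empty list means every file was copied and verified; anything else names the file and the reason. Nothing is deleted from the source here, so the "delete as you go" risk does not apply, at the cost of the doubled space.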

1 comment

Yes! The worst is that even if you set up all kinds of reports etc. on what you expect, if the backup runs for weeks/months successfully, you just stop paying attention and then when something fails, you won't notice it.
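
One way around the "you stop reading the reports" problem is a check that stays quiet while things are fine and only speaks up when the last success is too old. A rough sketch in Python, assuming the backup job touches a marker file each time it finishes successfully (`staleness_alert` and the marker file are hypothetical, not part of any tool):

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


def staleness_alert(
    marker: Path, max_age: timedelta, now: Optional[datetime] = None
) -> Optional[str]:
    """Return an alert message if the last success is missing or too old, else None."""
    now = now or datetime.now(timezone.utc)
    if not marker.exists():
        return f"no successful backup recorded at {marker}"
    # The marker's modification time stands in for "last successful run".
    finished = datetime.fromtimestamp(marker.stat().st_mtime, tz=timezone.utc)
    age = now - finished
    if age > max_age:
        return f"last successful backup finished {age} ago (limit {max_age})"
    return None
```

This would have to run separately from the backup job itself (a different cron entry or a different machine), otherwise a backup that never starts also never complains.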

I do think that file systems that support snapshots are the way to go: ZFS, for example, though I think LVM can provide them for stuff like ext4, and Apple's APFS has them too. Not sure how well NTFS's Shadow Copies (Volume Shadow Copy Service) work; I've heard horror stories, but I'm not sure whether those are one-off freak accidents. Probably worth considering ReFS anyway these days on a Windows Server. But with a snapshot, you're at least mostly insulating yourself from changes to the data you're backing up. That comes at the expense of managing snapshots, that is, getting rid of old ones after a while because they keep taking up space.
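
The "getting rid of old ones" part is mostly a retention policy. Here is a small sketch in Python of one possible policy: keep the newest few snapshots plus the newest one from each of the most recent days. The function name, the parameters, and the policy itself are assumptions for illustration; it only decides which names to prune and leaves the actual deletion to whatever the file system provides.

```python
from datetime import datetime


def snapshots_to_delete(
    snapshots: dict[str, datetime], keep_last: int, keep_daily: int
) -> list[str]:
    """Return names of snapshots to prune, oldest first.

    Keeps the `keep_last` newest snapshots, plus the newest snapshot from each
    of the `keep_daily` most recent calendar days that have a snapshot.
    """
    newest_first = sorted(snapshots, key=snapshots.get, reverse=True)
    keep = set(newest_first[:keep_last])

    days_seen = []
    for name in newest_first:
        day = snapshots[name].date()
        if day not in days_seen:
            if len(days_seen) == keep_daily:
                break  # enough distinct days covered
            days_seen.append(day)
            keep.add(name)  # first hit per day is that day's newest snapshot

    return [name for name in reversed(newest_first) if name not in keep]
```

Returning the list instead of deleting directly makes it easy to log or dry-run the decision first, which matters given how quietly these things tend to go wrong.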

(Edit: Though a snapshot of the file system isn't enough if you need to back up services that are currently running. E.g., a database server might have stuff uncommitted in memory that wouldn't be captured by a file system snapshot. But database backups are their own beast to wrangle.)