Hacker News new | ask | show | jobs
by cookiecaper 3418 days ago
A snapshot could be a backup depending on what you're calling a snapshot, but yeah, in general, to be a backup things need to have these features:

1. stored on separate infrastructure so that obliteration of the primary infrastructure (AWS account locked out for non-payment, password gets stolen and everything gets deleted, datacenter gets eaten by a sinkhole, etc.) doesn't destroy the data.

2. offline, read-only. This is where most people get confused.

Backups are unequivocally NOT a live mirror like RAID 1, slightly-delayed replication setup like most databases provide, or a double-write system. These aren't backups because they make it impossible to recover from human errors, which include obvious things like dropping the wrong table, but also less obvious things, like a subtle bug that corrupts/damages some records and may take days or weeks to notice. Your standbys/mirrors are going to copy both of obvious and non-obvious things before you have a chance to stop them.

This is one of the most important things to remember. Redundancy is not backup. Redundancy is redundancy and it primarily protects against hardware and network failures. It's not a backup because it doesn't protect against human or software error.

3. regularly verified by real-world restoration cases; backups can't be trusted until they're confirmed, at least on a recurring, periodic basis. Automated alarms and monitoring should be used to validate that the backup file is present and that it is within a reasonable size variance between human-supervised verifications. Automatic logical checksums like those suggested by some other users in this thread (e.g., run pg_restore on a pg_dump to make sure that the file can be read through) are great too and should be used whenever available.

4. complete, consistent, and self-contained archive up to the timestamp of the backup. Differenced backups count as long as the full chain needed for a restoration is present.

This excludes COW filesystem snapshots, etc., because they're generally dependent on many internal objects dispersed throughout the filesystem; if your FS gets corrupted, it's very likely that some of the data referenced by your snapshots will be corrupted too (snapshots are only possible because COW semantics mean that the data does not have to be copied, just flagged as in use in multiple locations). If you can export the COW FS snapshot as a whole, self-contained unit that can live separately and produce a full and valid restoration of the filesystem, then that exported thing may be a backup, but the internal filesystem-local snapshot isn't (see also point 1).