| HN Mirror



	by hashworks 702 days ago
	While I do that, is that really the case? I can imagine database snapshots are consistent most of the time, but it can't be guaranteed, right? In the end it's like a server crash, the database suddenly stops.

2 comments

lmz 702 days ago

Your DB is supposed to guarantee consistency even across server crashes. (That's the Consistency and Durability parts of ACID.)


mdavidn 702 days ago

That consistency is built on assumptions about the filesystem that may not hold true of a copy made concurrently by a backup tool.

For example, the database might append to its write-ahead logs in a different order than the backup tool reads them.
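A toy sketch of that ordering problem, not taken from any real database: the segment names (`seg0`, `seg1`) and the sequence numbers are made up for illustration. It assumes a log split across two files, where a crash can only truncate the log, so recovery always sees records 1..k with no gaps. A backup that copies the files at different moments can end up with a gap instead.

```python
# Toy model: a write-ahead log split across two hypothetical segment files.
# Records carry increasing sequence numbers. A crash can only cut the log
# short, so crash recovery always sees a contiguous prefix 1..k.

def is_contiguous_prefix(seqs):
    """True if seqs is exactly 1, 2, ..., k for some k (k may be 0)."""
    return sorted(seqs) == list(range(1, len(seqs) + 1))

def run_backup_race():
    segments = {"seg0": [], "seg1": []}       # live files (hypothetical names)
    backup = {}

    segments["seg0"] += [1, 2]                # DB appends records 1 and 2
    backup["seg0"] = list(segments["seg0"])   # backup copies seg0 at this moment

    segments["seg0"] += [3]                   # DB finishes filling seg0 ...
    segments["seg1"] += [4, 5]                # ... and moves on to seg1
    backup["seg1"] = list(segments["seg1"])   # backup copies seg1 later

    live = segments["seg0"] + segments["seg1"]
    copied = backup["seg0"] + backup["seg1"]
    return live, copied

live, copied = run_backup_race()
print(live, is_contiguous_prefix(live))       # [1, 2, 3, 4, 5] True
print(copied, is_contiguous_prefix(copied))   # [1, 2, 4, 5] False
```

The copy is missing record 3 but contains records 4 and 5, a state the live system never passed through and that no crash of this toy model could produce.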


grumbelbart2 702 days ago

That's why you do a filesystem snapshot before the backup, something supported by all systems. The snapshot is immutable from the backup tool's point of view, so read order and subsequent writes don't matter.

The main difference is that Windows and macOS have a mechanism that notifies applications that a snapshot is about to be taken, allowing the applications (such as databases) to bring their files into a more "consistent" state first.

In theory, of course, database files should always be in a logically consistent state (what if power goes out?).


Sakos 702 days ago

> something supported by all systems

Well, supported by Windows and macOS. On Linux, only if you happen to use ZFS or Btrfs, and also only if the backup tool you use happens to rely on those snapshots.


c45y 702 days ago

I believe basically any filesystem will work if you have it on LVM. As a bonus, LV snapshots can be thin snapshots too.


jlokier 702 days ago

That works if the backup uses a snapshot of the filesystem at a point in time. Then the backup state is equivalent to what you'd get if the server suddenly lost power, which all good ACID databases handle.

The GP is talking about when the backup software reads database files gradually from the live filesystem at the same time as the database is writing the same files. This can result in an inconsistent "sliced" state in the backup, which is different from anything you get if the database crashes or the system crashes or loses power.

The effect is a bit like when "fsync" and write barriers are not used before a server crash, and an inconsistent mix of things ends up in the file. Even databases that claim to be append-only and resistant to this form of corruption usually have time windows where they cannot maintain that guarantee, e.g. when recycling old log space if the backup process is too slow.
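To make the "sliced" state concrete, here is a minimal simulation. Everything in it is an assumption for illustration: the two "pages", the balance invariant, and the use of `copy.deepcopy` as a stand-in for a filesystem snapshot. It compares a backup that reads a live store page by page with one that reads from a point-in-time image.

```python
import copy

TOTAL = 100  # invariant: the two balances always sum to this

def transfer(pages, amount):
    """One committed transaction; it touches both pages."""
    pages["page_a"] -= amount
    pages["page_b"] += amount

def live_copy(pages):
    """Backup reads page_a, a transaction commits, then it reads page_b."""
    backup = {"page_a": pages["page_a"]}
    transfer(pages, 30)
    backup["page_b"] = pages["page_b"]
    return backup

def snapshot_copy(pages):
    """Backup freezes a point-in-time image first, then reads from it."""
    frozen = copy.deepcopy(pages)   # stands in for a filesystem snapshot
    transfer(pages, 30)             # later writes do not affect the image
    return {"page_a": frozen["page_a"], "page_b": frozen["page_b"]}

sliced = live_copy({"page_a": 60, "page_b": 40})
frozen = snapshot_copy({"page_a": 60, "page_b": 40})
print(sliced, sum(sliced.values()))   # {'page_a': 60, 'page_b': 70} 130
print(frozen, sum(frozen.values()))   # {'page_a': 60, 'page_b': 40} 100
```

The live copy mixes a pre-transaction page with a post-transaction page and breaks the invariant; the snapshot copy matches a state the store actually held, which is the kind of state crash recovery is designed for.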
