by ratorx 700 days ago
One problem with file-based backups is that they are not atomic across the filesystem. If you ever back up a database (or really any application that expects atomicity while it’s running), the copy in the backup might be corrupt and you could lose data on restore. This might not seem like a big problem, but it can affect e.g. SQLite, which is quite popular as a file format.

Then again, the likelihood that the backup will be inconsistent is fairly low for a desktop, so it’s probably fine.

I think the optimal solution is:

1) file system level atomic snapshot (ZFS, BTRFS etc)

2) Back up the snapshot at the file level (restic, borg etc)

This way you get atomicity as well as a file-based backup which is redundant against filesystem-level corruption.
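
The two steps above can be sketched as a small script. This is only a rough outline under assumptions: the dataset name `tank/home`, its mountpoint `/tank/home` and the snapshot name `restic-backup` are made up for illustration, and restic is expected to find its repository and password in the environment. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: back up a ZFS snapshot instead of the live filesystem.
# Hypothetical names: dataset "tank/home", mounted at /tank/home.
DATASET="${DATASET:-tank/home}"
MOUNTPOINT="${MOUNTPOINT:-/tank/home}"
SNAP="${SNAP:-restic-backup}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

backup_from_snapshot() {
  # 1) atomic, filesystem-level snapshot
  run zfs snapshot "$DATASET@$SNAP" || return 1
  # 2) file-level backup of the frozen view exposed under .zfs/snapshot
  run restic backup "$MOUNTPOINT/.zfs/snapshot/$SNAP"
  status=$?
  # 3) drop the snapshot whether or not the backup succeeded
  run zfs destroy "$DATASET@$SNAP"
  return "$status"
}

backup_from_snapshot
```

Reusing one snapshot name keeps the path that restic sees the same from run to run. Run it with `DRY_RUN=0` once the printed commands look right for your pool.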


I agree with you, of course. On macOS, Arq uses APFS snapshots, and on Windows, it uses VSS. It'd be nice to use something similar on Linux with restic.

In my linked post above, I wrote about this:

"You might think btrfs and zfs snapshots would let you create a snapshot of your filesystem and then backup that rather than your current live filesystem state. That’s a good idea, but it’s still an open issue on restic for something like this to be built-in (link). There’s a proposal about how you could script it with ZFS in this nice article (link) on the snapshotting problem for backups."

The post contains the links with further information.

My imperfect personal workaround is to run the restic backup script from a virtual console (TTY) occasionally with my display server / login manager service stopped.

I run this from a ZFS snapshot. What I want backed up from my home dir lives on the same volume, so I don't have to launch restic multiple times. I have dedicated volumes for what I specifically want excluded from backups and ZFS snapshots (~/tmp, ~/Downloads, ~/.cache, etc).

I've been thinking of somehow triggering restic by zrepl whenever it takes a snapshot, but I haven't figured a way of securely grabbing credentials for it to unlock the repository and to upload to s3 without requiring user intervention.
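
One possible way to run restic unattended, sketched under assumptions rather than a recommendation: keep the credentials in a file only its owner can read and refuse to start if the permissions are looser. The path `/etc/restic/env` and the example values are hypothetical; `RESTIC_REPOSITORY`, `RESTIC_PASSWORD_FILE`, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are environment variables restic reads. How zrepl would trigger the script is not shown.

```shell
#!/bin/sh
# Sketch: unattended restic run with credentials from an owner-only file.
# Hypothetical file contents (plain VAR=value lines):
#   RESTIC_REPOSITORY=s3:s3.amazonaws.com/my-bucket
#   RESTIC_PASSWORD_FILE=/etc/restic/password
#   AWS_ACCESS_KEY_ID=...
#   AWS_SECRET_ACCESS_KEY=...
CRED_FILE="${CRED_FILE:-/etc/restic/env}"

load_credentials() {
  file="$1"
  [ -f "$file" ] || { echo "missing $file" >&2; return 1; }
  # Refuse files that group or others can read, write or execute.
  mode="$(ls -ld "$file" | cut -c1-10)"
  case "$mode" in
    -r[w-][x-]------) ;;
    *) echo "refusing $file: mode $mode is too open" >&2; return 1 ;;
  esac
  # Export every variable the file assigns so restic can see it.
  set -a
  . "$file"
  set +a
}

backup_snapshot() {
  # $1: path of an already-mounted snapshot
  load_credentials "$CRED_FILE" || return 1
  restic backup "$1"
}

# Only act when a snapshot path is given on the command line.
if [ "$#" -ge 1 ]; then
  backup_snapshot "$1"
fi
```

The secrets still sit on disk in plain text, so this only protects against other local users, not against someone who has root or the disk itself.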

You can also use lvm2 and then you get atomic snapshots with any file system (I think it needs to support fsfreeze, I guess all of them do).
I never knew this. Thanks for sharing!
lvm requires unallocated space in the volume which makes it kind of garbage to use for snapshots
Personally I've never found this to be an issue, as I increase volume sizes based on need rather than allocating 100% from the get-go. The space needed for short-lived snapshots is not that big, though that of course can depend on the system.

This also helps in dealing with runaway (or long-running) processes eating disk space, as you always have some extra space set aside.

Only a little (as much as data will change during the backup). And default filesystems nowadays support resizing downwards so you can make space after initial partitioning.
You have to know in advance to not allocate 100% to root and home otherwise you are SOL when you want to make space later. If you're lucky you can disable swap and temporarily use its allocation to do it, providing that is large enough for the changes.
This is not the case: as I said, you can shrink either of those filesystems and its container and use the freed space for this.

(Also I think lvm doesn't need the volume blocks to be contiguous on the physical volume. So you might have N free space after volume a and M after volume b, and lvm would let you create a new N+M sized volume.)

So when I have root and home mounted because I am using the computer can I shrink them? No because they are mounted.
Windows' Volume Shadow Copy Service[1] allows applications like databases to be informed[2] when a snapshot is about to be taken, so they can ensure their files are in a safe state. They also participate in the restore.

While Linux is great at many things, backups are one area I find lacking compared to what I'm used to from Windows. There I take frequent incremental whole-disk backups. The backup program uses the Volume Shadow Copy Service to provide a consistent state (as much as possible). Being incremental, they don't take much space.

If my disk crashes I can be back up and running like (almost) nothing happened in less than an hour. Just swap out the disk and restore. I know, as I've had to do that twice.

[1]: https://learn.microsoft.com/en-us/windows/win32/vss/the-vss-...

[2]: https://learn.microsoft.com/en-us/windows/win32/vss/overview...

LVM snapshots are copy on write and can be used the same way.
Any backup software that utilizes LVM in this way?

Ie automatically creates a snapshot and sends the incremental changes since previous snapshot to a backup destination like a NAS or S3 blob storage.

wyng backup does this. It uses the device mapper's thin_dump tools to allow for incremental backups between snapshots, too:

https://github.com/tasket/wyng-backup

edit: requires lvm thin provisioned volumes

There is also thin-send-recv which basically does the same as zfs send/recv just with lvm:

https://github.com/LINBIT/thin-send-recv

it uses the same functions of the device mapper to allow incremental sync of lvm thin volumes.

Thanks for the pointers, looks very relevant.

It's just such a low-effort peace of mind. Just a few clicks and I know that regardless what happens to my disk or my system, I can be up and running in very little time with very little effort.

On Linux it's always a bit more work, but backup and restore is one of those things I prefer not to be too complicated: stress levels are usually high enough when you need to restore, without also worrying about forgetting some step of the incantation.

It depends. Doing a complete disaster recovery of a Windows system can IMHO be a real struggle, especially if you have to restore to different hardware, which the system state backup that Microsoft offers does not support afaik.

Backing up a Linux system in combination with REAR:

https://github.com/rear/rear

and a backup utility of your choice for the regular backup has never failed me so far. I have used it to restore Linux systems to completely different hardware without any trouble.

I don't think the diffs are usable that way. They're actually more like an "undo log" in that the snapshot space is taken by "old blocks" when the actual volume is taking writes. It's useful for the same reasons as volume shadow copy: a consistent snapshot of the block device. (Also this can be very bad for write performance, as any write is doubled: to the snapshot and to the real device.)
Yeah ok, that makes sense. Write performance is a concern, but usually the backups run when there's little activity.
I think block-level snapshots would be very difficult to use this way.

I just make full deduplicated backups from LVM snapshots with kopia, but I've set that up on only one system; on the others I just use kopia as-is.

It takes some time, but that's fine for me. Previous backup of 25 GB an hour ago took 20 minutes. I suppose if it only walked files it knew were changed it would be a lot faster.

Thanks, sounds interesting. So you create a snapshot, then let kopia process that snapshot rather than the live filesystem, and then remove the snapshot?

> I suppose if it only walked files it knew were changed it would be a lot faster.

Right, for me I'd want to set it up to do the full disk, so it could be millions of files and hundreds of GB. But this trick should work with other backup software, so perhaps it's a viable option.

Exactly so.

Here's the script, should it be of benefit to someone, even if it of course needs to be modified:

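    #!/bin/sh
    # Back up LVM snapshots of root and docker-data with kopia, then clean up.
    success=false
    # Unmount and delete the snapshots; every step tolerates "not there".
    teardown() {
      umount /mnt/backup/var/lib/docker || true
      umount /mnt/backup/root/.cache || true
      umount /mnt/backup/ || true
      for lv in root docker-data; do
        lvremove --yes "/dev/hass-vg/$lv-snapshot" || true
      done
    
      # When called from the EXIT trap, exit 0 only if the backup completed.
      if [ "$1" != "no-exit" ]; then
        $success
        exit $?
      fi
    }
    
    set -x
    set -e
    # Clear leftovers from a previous failed run, then clean up on any exit.
    teardown no-exit
    trap teardown EXIT
    # 1G of copy-on-write space per snapshot; it must hold all changes made
    # to the origin volume while the backup runs.
    for lv in root docker-data; do
      lvcreate --snapshot -L 1G -n "$lv-snapshot" "/dev/hass-vg/$lv"
    done
    
    mount /dev/hass-vg/root-snapshot /mnt/backup
    mount /dev/hass-vg/docker-data-snapshot /mnt/backup/var/lib/docker
    # Bind the live cache into the snapshot so kopia's cache and logs persist.
    mount /root/.cache /mnt/backup/root/.cache -o bind
    
    # Run kopia inside the snapshot so the recorded paths match the live system.
    chroot /mnt/backup kopia --config-file="/root/.config/kopia/repository.config" --log-dir="/root/.cache/kopia" snap create / /var/lib/docker
    # /boot and /boot/efi are backed up from the live system, not from a snapshot.
    kopia --config-file="/root/.config/kopia/repository.config" --log-dir="/root/.cache/kopia" snap create /boot /boot/efi
    success=true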
While I do that, is that really the case? I can imagine database snapshots are consistent most of the time, but it can't be guaranteed, right? In the end it's like a server crash, the database suddenly stops.
Your DB is supposed to guarantee consistency even in server crashes (the Consistency and Durability parts of ACID).
That consistency is built on assumptions about the filesystem that may not hold true of a copy made concurrently by a backup tool.

e.g. The database might append to write-ahead logs in a different order than the order in which the backup tool reads them.

That's why you do a filesystem snapshot before the backup, something supported by all systems. The snapshot is constant to the backup tool, and read order or subsequent writes don't matter.

The main difference is that Windows and macOS have a mechanism that tells applications that a snapshot is about to be taken, allowing the applications (such as databases) to build a more "consistent" version of their files.

In theory, of course, database files should always be in a logically consistent state (what if power goes out?).

> something supported by all systems

Well, supported by Windows and macOS. Linux only if you happen to use zfs or btrfs, and also only if the backup tool you use happens to rely on those snapshots.

I believe basically any filesystem will work if you have it on LVM. Bonus of lv snaps being thin snapshots too
That works if the backup uses a snapshot of the filesystem or a point in time. Then the backup state is equivalent to what you'd get if the server suddenly lost power, which all good ACID databases handle.

The GP is talking about when the backup software reads database files gradually from the live filesystem at the same time as the database is writing the same files. This can result in an inconsistent "sliced" state in the backup, which is different from anything you get if the database crashes or the system crashes or loses power.

The effect is a bit like when "fsync" and write barriers are not used before a server crash, and an inconsistent mix of things end up in the file. Even databases that claim to be append-only and resistant to this form of corruption usually have time windows where they cannot maintain that guarantee, e.g. when recycling old log space if the backup process is too slow.
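
The "sliced" state is easy to reproduce with a toy simulation. This is an illustration only: two plain files stand in for a database file and its write-ahead log, and `cp -R` stands in for an atomic ZFS/btrfs/LVM snapshot, which it is not in real life.

```shell
#!/bin/sh
# Toy simulation of a "sliced" backup: two files that must always agree,
# copied one at a time while a writer updates both in between.
# (Temporary files are left in $work for inspection.)
work="$(mktemp -d)"
mkdir "$work/live" "$work/naive" "$work/snap" "$work/safe"

write_version() {   # the "database": sets both files to version $1
  echo "$1" > "$work/live/data"
  echo "$1" > "$work/live/wal"
}

consistent() {      # a copy is consistent if both files hold the same version
  [ "$(cat "$1/data")" = "$(cat "$1/wal")" ]
}

write_version 1

# Naive file-by-file backup of the live directory.
cp "$work/live/data" "$work/naive/data"   # reads data at version 1
write_version 2                           # the writer runs mid-backup
cp "$work/live/wal" "$work/naive/wal"     # reads wal at version 2

# Snapshot first, then back up the snapshot file by file.
cp -R "$work/live/." "$work/snap/"        # stand-in for an atomic snapshot
cp "$work/snap/data" "$work/safe/data"
write_version 3                           # later writes no longer matter
cp "$work/snap/wal" "$work/safe/wal"

if consistent "$work/naive"; then echo "naive: consistent"; else echo "naive: sliced"; fi
if consistent "$work/safe"; then echo "snapshot: consistent"; else echo "snapshot: sliced"; fi
```

The naive copy ends up with data from version 1 and a log from version 2, a combination that never existed on disk at any single moment, while the copy taken from the snapshot holds version 2 of both.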