by Symbiote 3352 days ago
I create parity archives to add some redundancy to my photographs. I've done it for years, and I think it's useful not to rely on a filesystem to handle this.

For example:

  par2create -r5 -n2 example.par2 *.jpg
creates two recovery files, between them giving 5% redundancy. I think that should be more than enough to repair bitrot within a file, but depending on how many photos there are, a whole lost file could account for more than 5% of the data.
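
To put a rough number on that last point, here is a minimal sketch that estimates the redundancy percentage needed to survive losing the largest file in a set. The function name min_redundancy_percent is made up for illustration, and the estimate assumes the recovery data only has to be as large as the lost file. par2 actually works in fixed-size blocks, so treat the result as a lower bound rather than a value to pass straight to -r.

```shell
# Hypothetical helper: prints ceil(100 * largest_file / total_size).
# Assumption: recovering a lost file needs at least as much recovery
# data as the file itself (ignores par2's block granularity).
min_redundancy_percent() {
    local total=0 largest=0 size f
    for f in "$@"; do
        # $(( ... )) strips the padding some wc implementations print.
        size=$(( $(wc -c < "$f") ))
        total=$(( total + size ))
        if [ "$size" -gt "$largest" ]; then
            largest=$size
        fi
    done
    # Nothing to protect if every file is empty or no files were given.
    [ "$total" -gt 0 ] || return 1
    # Integer ceiling division.
    echo $(( (100 * largest + total - 1) / total ))
}
```

Called as min_redundancy_percent *.jpg, this prints 5 for 20 equally sized photos and 10 for 10 of them, so -r5 only covers a whole lost file once an album has about 20 similar-sized photos or more.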

  par2verify example.par2
will verify, and par2repair will repair corrupted or missing files.
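
The two commands can be chained so that a repair is attempted only when verification fails. This is a sketch, not a tested recipe: the function name check_and_repair is made up, and it assumes par2verify and par2repair exit with a non-zero status on failure, which is worth confirming for the par2 build you have installed.

```shell
# Hypothetical wrapper: verify a parity set, repair only if needed.
# Assumption: par2verify/par2repair return non-zero on failure.
check_and_repair() {
    local index=$1
    if par2verify "$index" >/dev/null 2>&1; then
        echo "ok: $index"
    elif par2repair "$index" >/dev/null 2>&1; then
        echo "repaired: $index"
    else
        echo "unrecoverable: $index" >&2
        return 1
    fi
}
```

For the example above that would be check_and_repair example.par2, run from the directory holding the photos.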

https://en.wikipedia.org/wiki/Parchive

4 comments

Ha, I thought of writing a similar ad-hoc checksumming tool so many times. I should have checked. I now wonder how feasible it is to embed them invisibly in metadata fields.
It would be neat if there were a parity scheme fast enough to preserve 2% of all files on disk. It could even be tucked away behind savings from file-level compression.
> it's useful not to rely on a filesystem to handle this

In what way?

Relatively few filesystems offer thorough data checksumming. Hardly any offer erasure coding. RAID at the filesystem layer is a bit more common, but also more inconvenient. Doing erasure coding at the file archive level rather than in the filesystem gives you the freedom to move your archives onto standard everyday filesystems and devices without silently losing the protection.
You don't need _a lot_ of filesystems to offer it - just the one you use - and if this is important to you, just use ZFS.
What if I use multiple computers and multiple operating systems, and want to be able to work on this data on all of them? Par2 would allow me to create the needed data on any computer and test it on any computer.

The other alternative is having a server that's up and running all the time, exposed to the internet (or complicating the setup with a VPN), so I can sync. Operations would either take a long time over the internet, or I would need to transfer the data to my computer anyway, work on it, and sync it back. During that time, any protection ZFS offers is void, since anything could happen on my computer and I couldn't test for it locally.

ZFS is great. But it's not the answer to everything.

> You don't need _a lot_ of filesystems to offer it - just the one you use

Only if you use it everywhere. Including on your laptop.

How often have you had to, and been able to, correct data using your parity archive?
I think I've needed it once, and I was able to correct the error.

I have a cron job which runs every month and verifies each photo album; that picked up the error.
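
A monthly check along those lines might look roughly like the sketch below. It is not the actual script: the function name verify_albums and the layout <root>/<album>/<name>.par2 are assumptions for illustration, and it relies on par2verify exiting non-zero when it finds damage.

```shell
# Hypothetical monthly check: verify each album's parity set and
# print the ones that fail. Assumed layout: <root>/<album>/<name>.par2
verify_albums() {
    local root=$1 failed=0 index
    for index in "$root"/*/*.par2; do
        [ -e "$index" ] || continue          # glob matched nothing
        case "${index##*/}" in
            *.vol*.par2) continue ;;         # skip recovery volumes, use the index file
        esac
        # Run from inside the album so relative file names resolve.
        if ! ( cd "$(dirname "$index")" && par2verify "$(basename "$index")" ) >/dev/null 2>&1; then
            echo "FAILED: $index"
            failed=1
        fi
    done
    return "$failed"
}
```

A script that ends with verify_albums "$HOME/Photos" (path hypothetical) could then be scheduled with a crontab entry such as 0 3 1 * * followed by the script path. Since it prints only on failure, cron setups that mail job output would stay quiet unless an album needs attention.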