Hacker News new | ask | show | jobs
by yjftsjthsd-h 351 days ago
> Which implies you can already correct errors through a simple majority mechanism.

I don't think so? You set copies=2, and the disk says that your file starts with 01010101, except that the second copy says your file starts with 01010100. How do you tell which one is right? For that matter, even with only one copy ex. ZFS can tell that what it has is wrong even if it can't fix it, and flagging the error is still useful.

> So just by having the appropriate level of RAID you automatically solve the problem. Why is this in the fs layer then?

Similarly, you shouldn't need RAID to catch problems, only (potentially) to correct them. I do agree that it doesn't necessarily have to be in the FS layer, but AFAIK Linux doesn't have any other layers that do a good job of it (as mentioned above, dm-integrity exists but halving the write speed is a pretty big problem).

1 comments

> I don't think so?

The disk is going to report an uncorrected error for one of them.

> The disk is going to report an uncorrected error for one of them.

Emperical evidence has shown otherwise: I have regularly gotten checksum error reports that ZFS has complained about during a scrub.

The ZFS developers have said in interviews that disks, when asked from LBA 123 have returned the contents of LBA 234 (due to disk firmware bugs): the on-disk checksum for 234 is correct, and so the bits were passed up the stack, but that's not the data that the kernel/ZFS asked for. It is only be verifying at the file system layer than the problem was caught (because at the disk layer things were "fine").

A famous paper that used Google's large quantity of drives as a 'sample population' mentions file system-level checks:

* https://www.cs.toronto.edu/~bianca/papers/fast08.pdf

See also the Google File System paper (ยง5.2 Data Integrity):

* https://research.google/pubs/the-google-file-system/

Trusting drives is not wise.