by Xamayon
1405 days ago
The good thing about filesystems with no error checking is that they generally won't go back and corrupt old, idle, already-written data on disk when they have a memory issue. If you run a scrub (or whatever) to check the validity of old data and your system has a new memory issue, it could destroy everything very quickly, even without new file writes. With a less feature-rich filesystem your data corruption could still be quite bad, but it would have a better chance of recovery: even if the filesystem is severely damaged, the file data would be (mostly) untouched. ECC helps prevent both of these cases. It's really too bad it's so hard/expensive to use outside of server hardware.
No, it won't.
https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-y...
> Let’s assume that we have RAM that not only isn’t working 100% properly, but is actively goddamn evil and trying its naive but enthusiastic best to specifically kill your data during a scrub. First, you read a block. This block is good. It is perfectly good data written to a perfectly good disk with a perfectly matching checksum. But that block is read into evil RAM, and the evil RAM flips some bits. Perhaps those bits are in the data itself, or perhaps those bits are in the checksum. Either way, your perfectly good block now does not appear to match its checksum, and since we’re scrubbing, ZFS will attempt to actually repair the “bad” block on disk. Uh-oh! What now?
> Next, you read [a copy of the same block from another disk]. Now, if your evil RAM leaves this block alone, ZFS will see that the second copy matches its checksum, and so it will overwrite the first block with the same data it had originally – no data was lost here, just a few wasted disk cycles. OK. But what if your evil RAM flips a bit in the second copy? Since it doesn’t match the checksum either, ZFS doesn’t overwrite anything. It logs an unrecoverable data error for that block, and leaves both copies untouched on disk.
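For anyone who wants to poke at the logic in that quote, here is a toy model of it in Python. This is only a sketch of the decision rule as the article describes it, not ZFS code: `scrub_block`, `flip_bit`, and `evil_first_read` are made-up names, the "disk" is just a list of byte strings, the trusted checksum is assumed to arrive intact, and SHA-256 stands in for whatever checksum the pool actually uses.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum; a real pool may use a different algorithm.
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes) -> bytes:
    # Hypothetical "evil RAM": flips the lowest bit of the first byte.
    return bytes([data[0] ^ 0x01]) + data[1:]


def evil_first_read():
    # Returns a read function that corrupts only the first block it sees.
    calls = {"n": 0}

    def read(data: bytes) -> bytes:
        calls["n"] += 1
        return flip_bit(data) if calls["n"] == 1 else data

    return read


def scrub_block(copies, expected, read):
    """Toy scrub of one block.

    copies   -- list of on-disk copies (mutated only on repair)
    expected -- the checksum the block is supposed to have
    read     -- function modelling a read from disk into RAM
    """
    in_ram = [read(c) for c in copies]
    # Look for any copy that still matches the checksum after the read.
    good = next((d for d in in_ram if checksum(d) == expected), None)
    if good is None:
        # Nothing verifies, so nothing is written: the disk stays untouched.
        return "unrecoverable"
    repaired = False
    for i, d in enumerate(in_ram):
        if checksum(d) != expected:
            copies[i] = good  # overwrite only with data that verified
            repaired = True
    return "repaired" if repaired else "ok"


original = b"perfectly good data"
disk = [original, original]
print(scrub_block(disk, checksum(original), evil_first_read()))
print(disk == [original, original])
```

The point of the model is that a write only ever happens with data that has just passed the checksum, so a bit flip on every read ends in a logged error rather than an overwrite. Swapping `evil_first_read()` for plain `flip_bit` (corrupting both reads) is the second case from the quote.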