by dale_glass 699 days ago
> Yes, I did test my RAM, I know it's fine. For comparison, I've (unintentionally) run a ZFS system with bad RAM for years and it only manifested as an occasional checksum error.

Just luck. Software can't defend itself against bad RAM. There's always the possibility that bad RAM will cause ZFS to corrupt itself in some way it can't recover from.

Everything is in RAM. The kernel, the ZFS code, everything. All of that is vulnerable to corruption. No matter how fancy ZFS is, it can't stop its own code from being corrupted. It's just luck that it didn't happen.

2 comments

Well, yes and no. The amount of RAM occupied by the filesystem driver is negligible compared to the truckloads of filesystem data shoveled through it. If we assume that errors are comparatively rare, the code itself is unlikely to be affected. Even if you're unlucky enough to get RAM corruption in the 0.01% occupied by the ZFS driver, the chance that a bit will flip in just such a way as to make a checksum succeed when it should have failed due to a second bit flip is virtually nonexistent. It's much more likely that the driver simply crashes in some way. As such, ZFS is much more resilient to on-disk filesystem corruption from bad RAM than systems which don't do any checksumming at all.
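
To make the data-path half of that argument concrete, here is a rough sketch, not ZFS code. It assumes SHA-256 as a stand-in checksum (ZFS has several checksum algorithms, and this is not a claim about which one a given pool uses), and the helper names `checksum`, `flip_bit` and `undetected_flips` are made up for illustration. It flips every single bit of a block in turn, after the checksum was taken, and counts how many flips slip past verification.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # Stand-in for a filesystem block checksum (assumption: SHA-256).
    return hashlib.sha256(block).digest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    # Return a copy of the block with exactly one bit inverted.
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def undetected_flips(block: bytes) -> int:
    # Count single-bit flips that still match the original checksum.
    stored = checksum(block)
    return sum(
        1
        for i in range(len(block) * 8)
        if checksum(flip_bit(block, i)) == stored
    )


# A 512-byte block of arbitrary data, purely for illustration.
block = bytes(range(256)) * 2
print(undetected_flips(block))
```

With any reasonable checksum you'd expect that count to be zero. Note what this does not cover: a flip that lands before the checksum is computed gets checksummed as if it were good data, and a flip in the driver's own code is a different failure mode entirely.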
ECC RAM helps