by londons_explore 1079 days ago
> as I have had to recreate a 16tb array twice now due to journal corruption on power outages.

This isn't a good sign... Journalling should never, if bug-free, lead to data corruption or loss even if there is a power outage.

And, even if there was some bug, one would expect a robust filesystem to say "ah, there is some data corruption here, so we're gonna run an fsck and recover every single file on the disk except perhaps the one or two that the bug clobbered."


ZFS and Btrfs have demonstrated that devices have a variety of transient failures, including failing to maintain the write order implied by fsync or FUA.

This will thwart any filesystem.

SSDs do not reliably report UNC read errors when data can't be retrieved. Garbage or zeros are returned instead.

There's a reason why ext4 and XFS added journal and metadata checksumming. Storage devices just aren't that reliable at informing the kernel when they suspect the data returned is bad.
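
To make the checksumming point concrete, here is a minimal sketch of how a per-block checksum catches a device that hands back zeros or garbage without reporting an error. The names `seal_block` and `read_block` and the trailing 4-byte layout are made up for illustration, and `zlib.crc32` is only a stand-in for whatever checksum a real filesystem uses.

```python
import struct
import zlib

CSUM_LEN = 4  # hypothetical layout: payload followed by a 4-byte checksum


def seal_block(payload: bytes) -> bytes:
    """Append a little-endian CRC32 of the payload before writing it out."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def read_block(raw: bytes) -> bytes:
    """Return the payload, or raise if the stored checksum does not match."""
    payload, trailer = raw[:-CSUM_LEN], raw[-CSUM_LEN:]
    (stored,) = struct.unpack("<I", trailer)
    if zlib.crc32(payload) != stored:
        # The device reported success, but the contents are not what was written.
        raise IOError("checksum mismatch: device returned bad data silently")
    return payload
```

A block that comes back as all zeros fails the check, because the CRC32 of a non-empty run of zero bytes is not zero. The checksum only detects the problem; recovering the data still needs a second copy somewhere.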

Incorrect write order shouldn't thwart a CoW filesystem. It can check at mount time whether the last few commits are fully there.
If the write order isn't guaranteed, you can get a new super block in place without the updated trees being written. The super block then points to trees that don't exist.

Recent but no longer current trees can be partly overwritten once the kernel is informed a super block write was successful. But if the super block write wasn't actually successful (the device lied), the stale super block on disk points to damaged metadata and recoverability isn't certain.

You can tell if the metadata is correct by checking the hashes of everything committed by that superblock.

If it isn't correct, ignore it and move on to the previous superblock. Keep going until you can verify a contiguous 30 seconds of superblocks.
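
Roughly, the mount-time walk-back could look like the sketch below. It is an illustration under assumptions, not how any real filesystem does it: `Superblock`, `commit_is_intact` and `pick_mount_superblock` are hypothetical names, the disk is modelled as a dict of block address to bytes, and each superblock is assumed to list a SHA-256 hash for every block its commit wrote.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Superblock:
    generation: int              # monotonically increasing commit number
    commit_time: float           # seconds, as recorded at commit
    tree_hashes: Dict[int, str]  # block address -> expected SHA-256 hex digest


def commit_is_intact(sb: Superblock, disk: Dict[int, bytes]) -> bool:
    """True if every block this commit wrote is present and hashes correctly."""
    for addr, expected in sb.tree_hashes.items():
        data = disk.get(addr)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            return False
    return True


def pick_mount_superblock(
    superblocks: List[Superblock],
    disk: Dict[int, bytes],
    window: float = 30.0,
) -> Optional[Superblock]:
    """Walk newest to oldest and return the newest superblock that starts an
    unbroken run of verified commits covering at least `window` seconds."""
    ordered = sorted(superblocks, key=lambda s: s.generation, reverse=True)
    candidate: Optional[Superblock] = None
    for sb in ordered:
        if not commit_is_intact(sb, disk):
            candidate = None  # run is broken, start over from the next older one
            continue
        if candidate is None:
            candidate = sb
        if candidate.commit_time - sb.commit_time >= window:
            return candidate
    # No verified run was long enough (this includes a history shorter than
    # `window`); what to do then is a policy choice this sketch leaves open.
    return None
```

The 30 second default matches the window above. It only works if the older commits' blocks are still on disk when mount runs.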

If writes are being delayed by more than 30 seconds, your problems go beyond "out of order".

This does impose the requirement not to overwrite trees that are only a minute or two old. That should not be hard.
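
A sketch of that retention rule, assuming a simple deferred-free queue: blocks from superseded trees are parked and only handed back to the allocator once they have aged past a hold time. `DeferredFreeList` and the 120 second default are hypothetical, picked to match "a minute or two".

```python
from collections import deque
from typing import Deque, List, Tuple


class DeferredFreeList:
    """Hold blocks of superseded trees until they are old enough to reuse."""

    def __init__(self, hold_seconds: float = 120.0) -> None:
        self.hold_seconds = hold_seconds
        # (time the block was superseded, block address), oldest first
        self._pending: Deque[Tuple[float, int]] = deque()

    def retire(self, addr: int, now: float) -> None:
        """Record that `addr` is no longer referenced by the current tree.
        Assumes calls arrive with non-decreasing `now`."""
        self._pending.append((now, addr))

    def reclaimable(self, now: float) -> List[int]:
        """Pop and return every block that has been retired long enough."""
        ready: List[int] = []
        while self._pending and now - self._pending[0][0] >= self.hold_seconds:
            ready.append(self._pending.popleft()[1])
        return ready
```

The allocator would only overwrite addresses returned by `reclaimable`, so anything a superblock from the last couple of minutes points to stays intact. The cost is that free space takes a bit longer to come back under heavy churn.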