Hacker News
by notacoward 1540 days ago
> But it can detect the case when it can't recover.

That is simply not true. For any parity/ECC/FEC/erasure-code scheme carrying M data bits in N total bits (N greater than M but less than 2M), there must be multiple data patterns that match the same error checks. That's just mathematics. Also, bear in mind that ECC bits can be corrupted too. This opens up the distinct possibility of something that looks like a correctable error, but where the "correction" leads to a wrong result. I've seen such issues in many kinds of storage systems, from low level to high. Anyone who has actually worked in this area, instead of deriving their "expertise" from a quick scan of Wikipedia, would be utterly unsurprised by the idea that disk firmware might do such a thing, or have bugs in its ECC implementation, or not follow a spec.
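To make the miscorrection case concrete, here is a minimal sketch using a textbook Hamming(7,4) code. That code is my choice for illustration only; real drives use much stronger codes, and the names `encode`, `syndrome`, `decode` and `flip` are hypothetical helpers, not anything from actual firmware. The failure mode is the same in kind: enough flipped bits make the damage look like a smaller, correctable error.

```python
# Hamming(7,4): 4 data bits + 3 parity bits, minimum distance 3.
# Bit positions are 1-indexed; parity bits sit at positions 1, 2 and 4.

def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def syndrome(word):
    """Return 0 if all checks pass, else the position the decoder will blame."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s4 = word[3] ^ word[4] ^ word[5] ^ word[6]
    return s1 + 2 * s2 + 4 * s4

def decode(word):
    """Apply single-bit 'correction'; return (data bits, position flipped or 0)."""
    word = list(word)
    pos = syndrome(word)
    if pos:
        word[pos - 1] ^= 1  # the decoder assumes exactly one bit is bad
    return [word[2], word[4], word[5], word[6]], pos

def flip(word, *positions):
    """Return a copy of word with the given 1-indexed positions inverted."""
    word = list(word)
    for p in positions:
        word[p - 1] ^= 1
    return word

data = [1, 0, 1, 1]
codeword = encode(data)

# One flipped bit: genuinely corrected.
print(decode(flip(codeword, 3)))

# Two flipped data bits: reported as a corrected single-bit error, data is wrong.
print(decode(flip(codeword, 3, 5)))

# Two flipped parity bits, data intact: the "correction" damages good data.
print(decode(flip(codeword, 1, 2)))

# Three flipped bits that land on another codeword: no error reported at all.
print(decode(flip(codeword, 3, 5, 6)))
```

In the last three cases the decoder has no way to tell that it returned the wrong data; from its side they look like a routine correction or a clean read.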

Whatever the causes, whatever the merely-theoretical probabilities, the fact remains that I've seen these. I've been paged for them. I've done the analyses of possible causes. A bit pattern was written and repeatedly verified over quite a long period of time (ruling out data path issues), then at some point a different bit pattern was read and would persistently be read thereafter. How is that not real bitrot? How does it matter, beyond ruling out everything above the disk level, what the precise causes are? If you can't answer those questions, you're just posting noise.

1 comment

> That is simply not true.

It indeed is not. I had to reread the theory, and I stand corrected: RS-style ECC can't reliably detect errors in excess of the redundancy count.
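For reference, the textbook bound behind this, assuming an (n, k) Reed-Solomon code over symbols with bounded-distance decoding. The symbols n, k, t and e are my notation here, not anything taken from a drive spec:

```latex
d_{\min} = n - k + 1,
\qquad
t + e \le d_{\min} - 1 = n - k \quad (e \ge t)
```

Here t is the number of symbol errors the decoder is set up to correct and e is the number it is still guaranteed to detect. With full correction, t = floor((n - k) / 2), so an error pattern heavier than n - k - t symbols can land within distance t of a different valid codeword and be silently "corrected" to it.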

> How is that not real bitrot?

It is and I can see how it can happen.

> How does it matter, beyond ruling out everything above the disk level, what the precise causes are?

It would've mattered if a drive could detect on-disk bitrot reliably, which was what the stats I worked with (also in exabytes, funnily enough) and the IEEE papers I read led me to believe.

For what it's worth, you won. Hats off.

Thank you for an interesting (despite being contentious) conversation.