Hacker News
by StillBored 3648 days ago
Others have pointed out the potential HW issues, but I implemented something similar in a product that stored data in its own on-disk format. Adding checksums to all data written to disk turned up a number of cases where what we thought were HW failures were actually SW failures, i.e. really, really obscure bugs that only happened under obscure conditions. One example: think of the equivalent of an fsck bug when checking the filesystem after power loss, where the journal needed to be in exactly the right state to trigger the recovery bug.
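
For anyone wondering what "checksums on all data written to disk" can look like, here is a minimal sketch in Python. It is not the format from the product described above; the record layout (length plus CRC32 header) and the names `pack_record`, `unpack_record`, and `ChecksumError` are assumptions made up for illustration.

```python
import struct
import zlib

# Hypothetical record layout: 4-byte payload length, 4-byte CRC32, then payload.
HEADER = struct.Struct("<II")


class ChecksumError(Exception):
    """Raised when a record read back does not match what was written."""


def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32 before it goes to disk."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unpack_record(record: bytes) -> bytes:
    """Verify length and CRC32 of a record read from disk; return the payload."""
    if len(record) < HEADER.size:
        raise ChecksumError("record shorter than header")
    length, expected = HEADER.unpack_from(record)
    payload = record[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ChecksumError("truncated payload")
    actual = zlib.crc32(payload)
    if actual != expected:
        raise ChecksumError(f"CRC mismatch: stored {expected:#010x}, computed {actual:#010x}")
    return payload
```

A mismatch only says the bytes read differ from the bytes the checksum was computed over. It does not say whether the disk or the code is at fault, which is why logging every mismatch with enough context to reproduce it is what lets you tell the two apart.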