|
|
|
|
|
by jandrewrogers
848 days ago
|
|
ECC isn’t free and ECC has a limited ability to detect all statistically plausible errors. Additionally, error correction in hardware is frequently defined by standards, some of which have backward compatibility requirements that go back decades. This is why, for example, reliable software often uses (quasi-)cryptographic checksums at all I/O boundaries. There is error correction in the hardware but in some parts of the silicon that error correction is weak enough that it is likely to eventually deliver a false negative in large scale systems. None of this is free, and there are both hardware and software solutions for mitigating various categories of risk. It is explicitly modeled as an economics problem i.e. how does the cost of not mitigating a risk, if it materializes, compare to the cost of minimizing or eliminating it. In many cases, the optimal solution is unintuitive, such as computing everything twice or thrice and comparing the results rather than using error correction. |
|