| Another in a long line of unexpected causes for system errors, the classic being the cosmic ray: "When your computer crashes or phone freezes, don't be so quick to blame the manufacturer. Cosmic rays -- or rather the electrically charged particles they generate -- may be your real foe. While harmless to living organisms, a small number of these particles have enough energy to interfere with the operation of the microelectronic circuitry in our personal devices. It's called a single-event upset or SEU. During an SEU, particles alter an individual bit of data stored in a chip's memory. Consequences can be as trivial as altering a single pixel in a photograph or as serious as bringing down a passenger jet." https://www.computerworld.com/article/3171677/computer-crash... Those of us who do high-level development generally treat these underlying systems as infallible. But as we continue to scale, and as more money and lives are on the line, we'll need to get used to not trusting the underlying hardware, as the article states, and we may even have to get to the point where we have "ECC at the system level." We already have this in various distributed systems tech, but it tends to be application-specific.
The next step would be to incorporate it directly into datastores generally. This also suggests that heterogeneous hardware architectures can have an advantage in situations where data integrity is critical, even with the increased administration, hardware, and ops costs. Finally, all of this highlights the importance of regular data audits and reconciliations, even for non-suspect data, preferably run on the aforementioned heterogeneous setup. |
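
To make "ECC at the system level" a bit more concrete, here is a rough sketch of what a datastore could do: keep several copies of each value alongside a digest, verify on read, repair from the copies that still check out, and run a periodic audit even when nothing looks wrong. Everything here is hypothetical (`ReplicatedStore`, `put`, `get`, `audit` are names I made up), and the "replicas" are just dicts in one process. In a real deployment they would live on separate machines, ideally on different hardware.

```python
import hashlib
from collections import Counter


def checksum(value: bytes) -> str:
    """Digest stored next to each copy so corruption can be detected later."""
    return hashlib.sha256(value).hexdigest()


class ReplicatedStore:
    """Hypothetical in-memory store: N copies per key, each with its own digest."""

    def __init__(self, replicas: int = 3):
        self.replicas = [dict() for _ in range(replicas)]

    def put(self, key: str, value: bytes) -> None:
        digest = checksum(value)
        for replica in self.replicas:
            # Each replica gets its own mutable copy of the bytes.
            replica[key] = (bytearray(value), digest)

    def get(self, key: str) -> bytes:
        # Keep only the copies whose bytes still match their stored digest.
        good = []
        for replica in self.replicas:
            value, digest = replica[key]
            if checksum(bytes(value)) == digest:
                good.append(bytes(value))
        if not good:
            raise IOError(f"all copies of {key!r} failed verification")
        # Majority vote among verified copies, then rewrite every replica.
        winner, _ = Counter(good).most_common(1)[0]
        self.put(key, winner)
        return winner

    def audit(self) -> list:
        """Return (replica_index, key) for every copy that fails verification."""
        bad = []
        for index, replica in enumerate(self.replicas):
            for key, (value, digest) in replica.items():
                if checksum(bytes(value)) != digest:
                    bad.append((index, key))
        return bad


store = ReplicatedStore()
store.put("balance", b"1000")
store.replicas[1]["balance"][0][0] ^= 0x01  # simulate a single bit flip
found = store.audit()           # the audit flags replica 1
value = store.get("balance")    # the read verifies, votes, and repairs
```

The audit is the "reconcile even non-suspect data" part: it walks everything rather than waiting for a read to trip over a bad copy. The digest only detects corruption; the correction comes from the redundant copies.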