Hacker News
by tenfingers 3975 days ago
Cannot upvote enough. At this density, and considering that cached data can persist in main memory for days (if not more), random errors become a major issue. ECC should have become standard in system memory 5 years ago already.

Wait, is that true? There is no error correction in system memory? That sounds like a huge waste. Usually with coding, not only does reliability increase, but you can decrease power consumption a lot too (so you're not "fighting the noise" with power only).
Yes, exactly true. Some storage engines (like InnoDB) actually checksum their contents to self-detect when corruption is happening. Most applications and libraries trust system memory too much and can read corrupted data at any given time.
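
For anyone wondering what "checksum your own memory" looks like in practice, here is a minimal sketch. It is not how InnoDB does it; `CheckedBuffer` is a hypothetical class made up for illustration, and CRC32 from Python's standard `zlib` module is just one convenient choice of checksum. The idea is simply to store a checksum next to the data and re-verify it on every read.

```python
import zlib


class CheckedBuffer:
    """Hypothetical wrapper: keeps a CRC32 alongside an in-memory buffer."""

    def __init__(self, payload: bytes):
        self._data = bytearray(payload)
        self._crc = zlib.crc32(self._data)

    def write(self, payload: bytes) -> None:
        # Replace the contents and recompute the stored checksum.
        self._data = bytearray(payload)
        self._crc = zlib.crc32(self._data)

    def read(self) -> bytes:
        # Verify before handing the data back to the caller.
        if zlib.crc32(self._data) != self._crc:
            raise ValueError("checksum mismatch: buffer was corrupted")
        return bytes(self._data)


buf = CheckedBuffer(b"account balance: 1000")
clean = buf.read()          # passes verification

# Simulate a random bit flip behind the application's back.
buf._data[3] ^= 0x10
try:
    buf.read()
    detected = False
except ValueError:
    detected = True
```

This only detects corruption, it cannot repair it, and it costs a pass over the data on every read, so whether it is worth it depends on how much you trust the hardware underneath.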

Another fun way to get corrupted memory is to have some data swapped to disk, but have disk corruption, then have the corrupt disk data restored to memory. Bam. Instant invalid data in memory, but not caused by memory.

Many hashes are fast these days and should be used as checksums in more places since ECC isn't as common as it should be (and ECC doesn't detect all errors anyway).
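
To illustrate the "ECC doesn't detect all errors" point, here is a toy single-error-correct, double-error-detect code (Hamming(7,4) plus an overall parity bit). This is an assumption-laden sketch: real ECC memory uses wider codes, and the `encode`/`decode` helpers below are hypothetical names written for this example. The failure mode has the same shape, though: one flipped bit is fixed, two are flagged, and three can slip through as a "successful" correction.

```python
def encode(data):
    """Encode 4 data bits into 8 bits: index 0 is overall parity,
    indices 1..7 are Hamming positions (parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]
    p0 = 0
    for bit in word:
        p0 ^= bit
    return [p0] + word


def decode(word):
    """Return (status, data_bits) for an 8-bit codeword."""
    word = list(word)
    syndrome = 0
    overall = 0
    for pos, bit in enumerate(word):
        overall ^= bit
        if bit and pos:
            syndrome ^= pos   # XOR of the positions of all set bits
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume exactly one and flip it back.
        # A syndrome of 0 here points at the overall parity bit itself.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        status = "uncorrectable"   # even number of flips, syndrome nonzero
    return status, [word[3], word[5], word[6], word[7]]


def flip(word, positions):
    """Return a copy of word with the given bit positions inverted."""
    out = list(word)
    for pos in positions:
        out[pos] ^= 1
    return out


original = [1, 0, 1, 1]
codeword = encode(original)

one = decode(flip(codeword, [5]))         # single flip: repaired
two = decode(flip(codeword, [3, 5]))      # double flip: flagged
three = decode(flip(codeword, [3, 5, 6])) # triple flip: silently wrong
```

In the triple-flip case the decoder reports "corrected" but hands back different data bits, which is exactly the situation an independent checksum or hash over the data would catch.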