Hacker News
by Nerada 765 days ago
DDR5 comes with on-die ECC. My understanding is that this only catches errors occurring within the RAM itself, not errors that occur during transmission to and from the RAM.

My question is, how common are transmission errors compared with errors happening within the RAM?

4 comments

On-die ECC exists so that manufacturers can give you a memory array with a few faults. It's a yield enhancement, not an introduction of ECC as you usually think of it.

Adding protocol-level ECC on top only helps, although it is somewhat inefficient.
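
To make "correcting a few faults" concrete, here is a toy single-error-correcting code. This is a minimal sketch using Hamming(7,4), chosen only because it is small; the code actually used inside DDR5 dies operates on much wider words and is not shown here. The function names `encode` and `decode` are made up for illustration.

```python
# Toy Hamming(7,4) single-error-correcting code (illustrative only; not the
# code real DRAM uses, just the same idea at a tiny scale).

def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]  # positions 1..7


def decode(c):
    """Return (data_bits, pos); pos is the 1-based bit that was fixed, 0 if none."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3  # syndrome spells out the flipped position
    if pos:
        c[pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], pos


if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate one faulty cell
    print(decode(word))
```

Any single flipped bit in the stored word is repaired on read, which is why a die with a few weak cells can still be sold as working. Two flipped bits in the same word would be miscorrected by this toy code.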

On-die ECC is similar to what happened with SSDs, which keep switching to less and less reliable cells for density and now need error correction built in to function at all.

Another problem with on-die ECC is the lack of reporting.

You have no idea whether you are getting tons of errors or how many were corrected.
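
For contrast, conventional (side-band) ECC on Linux usually does report its counts through the EDAC sysfs tree. The following is a minimal sketch, assuming an EDAC driver is loaded for your memory controller and exposes the usual `ce_count` and `ue_count` files; `read_edac_counts` is a made-up helper name, and corrections done silently by on-die ECC are not expected to show up in these counters.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters.

    Returns an empty dict when the directory is missing, e.g. when no EDAC
    driver is loaded or the platform has no reporting ECC at all.
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # counter missing or unreadable for this controller
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

An empty result on a DDR5 desktop is consistent with the complaint above: the errors may be happening and getting fixed, but nothing tells you.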

Sort of. It's not the same as extended ECC like ChipKill.

https://en.wikipedia.org/wiki/Chipkill

DRAM Errors in the Wild: A Large-Scale Field Study (2009)

https://static.googleusercontent.com/media/research.google.c...

LPDDR4/4X has also had on-die ECC for a while (at least the chips I'm used to, like the ones in the Raspberry Pi); with such small lithography it's basically required to get the RAM to work reliably.