by Dylan16807 2564 days ago
I don't understand. At that rate your chance of having a second bit flip before the first is fixed is almost zero, and the chance of hitting two separate bit flips in the same row is ridiculously small.

If you're worried about a single event causing two flips in a single row... I suppose that's possible, but it could also cause three bit flips. So a Xeon has a non-zero error rate. Is Ryzen meaningfully worse?

1 comment

I think his argument is that you will get bit flips, and ECC is going to report and/or correct them. Without it, you're hoping the bit flips show up somewhere you can detect them (an application crash, etc.) rather than silently chugging along and ruining your results/data/whatever.

I had the chance a long time ago to work on a product that, as a side effect, was corrupting system memory... Think of it as a kernel module that picks a random number between 0 and MAX_RAM and flips a byte. It's truly amazing how many of those can happen before there is any visible evidence something is wrong.
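
For a rough feel of why so many corruptions go unnoticed, here is a toy user-space model (not the kernel module described above). It assumes, purely for illustration, that only a small "live" fraction of memory is ever read back in a way that would expose a corrupted byte. The function name and the `total_bytes` and `live_bytes` parameters are hypothetical.

```python
import random

def flips_until_noticed(total_bytes, live_bytes, rng):
    """Count random byte corruptions until one lands in the 'live' region.

    Assumption: addresses [0, live_bytes) hold data whose corruption
    would be noticed; everything else is corrupted silently.
    """
    count = 0
    while True:
        count += 1
        addr = rng.randrange(total_bytes)  # random address, as in the anecdote
        if addr < live_bytes:              # corruption hit something that matters
            return count

def average_flips(total_bytes, live_bytes, trials=2000, seed=0):
    """Average number of corruptions before the first noticed one."""
    rng = random.Random(seed)
    total = sum(flips_until_noticed(total_bytes, live_bytes, rng)
                for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # With 1% of memory "live", expect roughly 100 corruptions on average.
    print(average_flips(total_bytes=1_000_000, live_bytes=10_000))
```

Under this model the count follows a geometric distribution, so the average is about total_bytes / live_bytes. Real systems are messier, since a corrupted byte in live data can still go unnoticed.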

You're talking about ECC vs. no ECC. That's not what the comment was saying; it was saying it handled single bit flips correctly but not double bit flips. But at 1 bit flip per GB per year, randomly distributed, you are guaranteed many single bit flips, but a double bit flip is almost never going to occur.
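
A back-of-the-envelope version of that arithmetic, as a sketch. The 1 flip per GB per year rate and the uniform distribution come from the comment. The rest are assumptions: each ECC word protects 64 data bits, a GB is 2**30 bytes, and nothing is ever scrubbed, so flips pile up for the whole period (which is pessimistic). The function name and parameters are made up for illustration.

```python
import math

def double_flip_probability(gb, flips_per_gb_year=1.0, years=1.0, word_bits=64):
    """Approximate chance that any two flips land in the same ECC word.

    Uses the birthday-problem approximation: with n flips spread
    uniformly over W words, the expected number of colliding pairs
    is n * (n - 1) / 2 / W.
    """
    n = flips_per_gb_year * gb * years            # expected number of flips
    words = gb * 2**30 * 8 / word_bits            # number of ECC-protected words
    expected_pairs = max(0.0, n * (n - 1) / 2) / words
    return -math.expm1(-expected_pairs)           # 1 - exp(-x), stable for tiny x

if __name__ == "__main__":
    for gb in (16, 64, 512):
        print(gb, "GB:", double_flip_probability(gb))
```

For 64 GB over a year that is 64 flips across 2**33 words, so the chance of any two sharing a word comes out on the order of 1 in a few million, even with no scrubbing at all. This only covers independent flips; a single event that flips multiple bits in one word is a separate question, as noted upthread.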