by caeril 1655 days ago
No personal experience other than a slightly different one: running production services (involving money!) on another box without ECC DRAM (to save money!) and seeing random permission-flag flips and actual balance/amount flips. Only a small handful over many years, but it does happen, and when it matters, it REALLY matters.

My advice is to always use ECC DRAM in production unless you're serving cat photos, porn, social media posts, or other societally useless applications. For anything that actually matters, please use ECC.

1 comments

Yes, this is one concern. Are you sure it was a result of using non-ECC memory, and how did you find out it was because of that?
We could never be absolutely sure, due to the true Heisenbug nature of the behavior. But after tons of code audits, and after reverse proxy traffic analysis showed that it only occurred on requests processed by the non-ECC hosts and never on the ECC hosts, we concluded that non-ECC memory was the most likely culprit.

The fact that the errors were single bit errors also strongly pointed in that direction.
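
As a rough illustration of that single-bit check (not the code from the incident described above; the function names and sample amounts are hypothetical), XOR-ing the expected value with the observed one and counting the set bits shows whether they differ in exactly one bit position. This sketch assumes both values are available as non-negative integers:

```python
def differing_bits(expected: int, observed: int) -> int:
    """Count the bit positions in which two non-negative integers differ."""
    if expected < 0 or observed < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # XOR leaves a 1 in every position where the two values disagree.
    return bin(expected ^ observed).count("1")


def looks_like_single_bit_flip(expected: int, observed: int) -> bool:
    """True when the two values differ in exactly one bit position."""
    return differing_bits(expected, observed) == 1


if __name__ == "__main__":
    # Hypothetical amounts in cents: 1000 with bit 13 flipped becomes 9192.
    expected_cents = 1000
    observed_cents = expected_cents ^ (1 << 13)
    print(observed_cents, looks_like_single_bit_flip(expected_cents, observed_cents))
```

A mismatch that differs in exactly one bit is consistent with a memory bit flip but does not prove one; a software bug that toggles a flag would look the same, which is why the host-by-host comparison still matters.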