by dboreham 960 days ago
No. Random bitflips (aka hardware that doesn't work) are a relatively new thing. Bit flips due to buggy software were a thing, though. This is why most database engines checksum the payload data even in memory. I've also seen network packets corrupted because a bridge (the former name for a switch) trashed data in flight, then recomputed the CRC over the corrupt data on the onward leg.
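
For anyone who hasn't seen the checksum idea in practice, here is a minimal sketch in Python. The names `seal`, `verify`, and `flip_bit` are made up for illustration and the payload is invented; real database engines use their own page formats and checksum algorithms. It also shows why the bridge case is so nasty: once the CRC is recomputed over already-corrupt data, the check passes.

```python
import zlib


def seal(payload: bytes) -> tuple[bytes, int]:
    """Return the payload together with its CRC-32 (hypothetical helper)."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, checksum: int) -> bool:
    """Recompute the CRC-32 and compare it with the stored one."""
    return zlib.crc32(payload) == checksum


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate a single bit flip at the given bit offset."""
    buf = bytearray(payload)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Invented example payload, standing in for an in-memory page.
page, crc = seal(b"row 42: balance=1000")
corrupted = flip_bit(page, 5)

print(verify(page, crc))       # True: untouched data matches its checksum
print(verify(corrupted, crc))  # False: the flip is detected

# The bridge scenario: the CRC is recomputed AFTER the corruption,
# so the receiver sees a checksum that matches the corrupt data.
_, resealed_crc = seal(corrupted)
print(verify(corrupted, resealed_crc))  # True: the corruption is now invisible
```

A per-hop checksum only protects the hop it was computed for, which is why an end-to-end check at the application level catches things that link-level CRCs miss.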
8 comments

>Random bitflips (aka hardware that doesn't work) are a relatively new thing

This implies that old hardware always worked, which I strongly doubt (what year did hardware go from always working to not?).

1993
I beg to differ. Early 90s there were some Amiga memory expansions that would constantly flip bits. I'm pretty sure it contributed to the sentiment that the system wasn't the most stable, although I'm pretty sure one or two of my friends with PCs saw similar issues on their machines. Maybe Microsoft Word wasn't to blame for all the crashes?

Of course, trying to work around it in software is utterly futile.

Bitflips aren’t a new thing. I’ve been rarely but painfully bitten by them since at least 1986. This excludes serial and modem communications, where it was a way of life.
SEE/SEU are not a relatively new thing. However, the frequency of events is inversely proportional to the feature size, which has been decreasing over time.

https://en.wikipedia.org/wiki/Single-event_upset

https://en.wikipedia.org/wiki/Die_shrink

> Random bitflips (aka hardware that doesn't work) are a relatively new thing.

I thought they (from cosmic rays, etc.) were always a thing, but so rare that you needed a very large system (in scope or time or both) to have a substantial chance of encountering one (outside of noisy comm channels, which use error correction protocols for exactly that reason.)

Some event (unknown) triggered multiple spikes in the "tell me three times" triple-redundant ADIRUs of Qantas Flight 72, causing a WTF unscheduled, sudden, and dramatic pitch-down.

https://en.wikipedia.org/wiki/Qantas_Flight_72#Conclusion
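
To illustrate the general "tell me three times" idea, here is a generic median voter in Python. This is only a sketch of the textbook technique, not the algorithm the aircraft's flight control computers actually use; the function name `vote`, the tolerance, and the readings are all invented for the example.

```python
from statistics import median


def vote(a: float, b: float, c: float, tolerance: float):
    """Pick the median of three redundant readings.

    Returns the median plus the indices of channels that differ from it
    by more than `tolerance` (hypothetical interface, for illustration).
    """
    readings = [a, b, c]
    mid = median(readings)
    outliers = [i for i, r in enumerate(readings) if abs(r - mid) > tolerance]
    return mid, outliers


# One channel spikes: the voter masks it and flags channel 1.
print(vote(2.1, 50.6, 2.0, tolerance=1.0))   # (2.1, [1])

# Two channels spike the same way: the voter is fooled and
# flags the one channel that was actually correct.
print(vote(50.6, 50.6, 2.0, tolerance=1.0))  # (50.6, [2])
```

The second call is the uncomfortable part: voting only helps while faults stay independent and confined to a minority of channels.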

Cosmic rays were suspected but unconfirmed (kind of hard to confirm after the fact).

"All the aircraft in the world" for sixty years is kind of a large system given that currently there are on the order of one million people in the air at any moment.

There are lots of things that can go wrong beyond cosmic rays. Like timing on the bus or signals from close wires. Digital is an abstraction of an analog and chaotic reality.
> Random bitflips (aka hardware that doesn't work) are a relatively new thing.

Bits flipping due to hardware that didn’t work well was what caused Xerox PARC to implement an error correcting memory for MAXC, fifty years ago.

https://gunkies.org/wiki/Maxc
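
For a feel of how error-correcting memory can repair a flipped bit, here is a sketch of the classic Hamming(7,4) code in Python. This is the textbook scheme, chosen only for illustration; I am not claiming it is what MAXC used, and the function names `encode` and `decode` are my own.

```python
def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Positions are numbered 1..7; parity bits sit at 1, 2 and 4,
    data bits at 3, 5, 6 and 7.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 is unused so indices match positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def decode(bits: list[int]) -> tuple[int, int]:
    """Correct up to one flipped bit and return (nibble, syndrome).

    The syndrome is 0 for a clean codeword, otherwise the 1-based
    position of the bit that was flipped.
    """
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1  # flip the offending bit back
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome


word = encode(0b1011)
word[4] ^= 1              # simulate a bit flip at position 5
print(decode(word))       # (11, 5): data recovered, bad position reported
```

Seven stored bits per four data bits is expensive; real memory ECC uses wider words so the overhead is much smaller, but the syndrome trick is the same idea.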

> Bit flips due to buggy software was a thing though.

No kidding.

What do you mean by relatively new? I observed bitflips a couple of decades ago, causing a machine to panic.
Panics and consequent crashes or reboots (?) used to happen in Unixen at times, maybe due to bitflips or other hardware errors.
Cosmic rays?