Hacker News
by lll-o-lll 205 days ago
Yep. Non-determinism. Back in the day it was memory corruption caused by some race condition. By the time things have gone pop, you’re too far from the proximate cause to have useful logs or dumps.

“Happens only once every 100k runs? Won’t fix.” That works until it doesn’t, and then they come looking for the poor bastard who never fixes a bug in 2 days.
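To put rough numbers on "once every 100k runs", here is a back-of-the-envelope sketch. It assumes the runs are independent and reads the figure as a per-run failure probability of 1e-5; both are simplifications, and the function names are made up for illustration.

```python
import math


def p_at_least_one_failure(p_per_run, runs):
    """Chance of seeing one or more failures across `runs` independent runs."""
    return 1.0 - (1.0 - p_per_run) ** runs


def runs_needed(p_per_run, confidence):
    """Smallest number of independent runs that hits the failure
    at least once with the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_per_run))


# Assumption: "once every 100k runs" means a per-run probability of 1e-5.
P = 1e-5

print(p_at_least_one_failure(P, 100_000))  # chance the fleet sees it in 100k runs
print(runs_needed(P, 0.95))                # runs needed to reproduce it on the bench
```

Under those assumptions, 100k runs in the field hit the bug with a probability of about 0.63, while reproducing it on demand with 95% confidence takes roughly 300,000 runs. That asymmetry is why it "works until it doesn't", and why the fix never lands in 2 days.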

2 comments

My first job was as an RF (microwave) bench technician. My initial schooling was at a trade school for electronic technicians.

It was all about fixing bugs; often, terrifying ones.

That background came in handy, once I got into software.

I started life as an engineer. Try reverse engineering why an electrical device your company designed (industrial setting, so big power) occasionally, and I mean really, really rarely, just explodes, burying its cover housing halfway through the opposite wall.

“Won’t fix” doesn’t get accepted so well. Trying to work out what the hell happened from the charred remains isn’t so easy either.

Sounds like some great stories.
The worst bug in my career was an app that would reliably crash if you left it running for "long enough", but still non-deterministically: sometimes it would happen in an hour, sometimes in three. The crash itself was quickly diagnosed as a corrupt vtable. Finding the code behind it took many days: a pointer bug that, in certain situations triggered by a race condition, just happened to write into (some) object's vtable.
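The general trick for this class of bug is to shrink the distance between the stray write and the moment you notice it. Below is a toy model of that idea, not what was actually done in the case above: a flat buffer whose slots are fenced by guard words, checked after every step so the first offender is named. `GuardedArena`, `first_corrupting_step`, and the step names are all hypothetical. In real C++ code, tools like AddressSanitizer or a debugger's hardware watchpoint play the same role.

```python
GUARD = b"\xde\xad\xbe\xef"


class GuardedArena:
    """A flat byte buffer whose fixed-size slots are fenced by guard words."""

    def __init__(self, slot_size, slot_count):
        self.slot_size = slot_size
        self.slot_count = slot_count
        self.stride = slot_size + len(GUARD)
        # Layout: GUARD slot GUARD slot ... GUARD
        self.buf = bytearray(GUARD + (bytes(slot_size) + GUARD) * slot_count)

    def slot_offset(self, index):
        """Offset of the first byte of slot `index`."""
        return len(GUARD) + index * self.stride

    def damaged_guards(self):
        """Offsets of guard words that no longer hold the GUARD pattern."""
        bad = []
        for i in range(self.slot_count + 1):
            off = i * self.stride
            if self.buf[off:off + len(GUARD)] != GUARD:
                bad.append(off)
        return bad


def first_corrupting_step(arena, steps):
    """Run steps in order, checking guards after each one.
    Returns the name of the first step that damaged a guard, or None."""
    for name, step in steps:
        step(arena)
        if arena.damaged_guards():
            return name
    return None


def write(slot, length):
    """Build a step that writes `length` bytes starting at `slot`."""
    def step(arena):
        off = arena.slot_offset(slot)
        arena.buf[off:off + length] = b"B" * length
    return step


arena = GuardedArena(slot_size=8, slot_count=3)
steps = [
    ("init", write(0, 8)),    # stays inside slot 0
    ("update", write(1, 9)),  # one byte too long: clobbers the next guard
    ("render", write(2, 8)),  # never reached once the check fires
]
culprit = first_corrupting_step(arena, steps)
print(culprit, arena.damaged_guards())
```

Without the per-step check, the damage would only surface whenever something later reads the clobbered bytes, which is the "too far from the proximate cause" problem in miniature.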