by m463 901 days ago
I remember there being a weird clock rollover bug that only financial firms would hit (since they never took their machines down, ever)

That was a long time ago. I wonder if technology/the cloud has changed that, or if they still run those same machines.

3 comments

30 years ago companies were rebooting their mainframes twice a year just to make sure they would still boot. Before that practice, companies got burned when a mainframe went down accidentally (backup generator broke during a power outage) and they couldn't get it to start, because someone had changed a setting at runtime but never saved it to the boot scripts, and then that person retired or found a new job. By rebooting twice a year they could ensure that someone still remembered which setting had been changed when the system failed to start.
Chaos Engineering!

Untested emergency plans are not a guarantee that the plans will work.

One thing I loved about ISO9001: sure, it made every sysadmin action into something that made police paperwork look 'light', but it ensured you didn't hit this kind of thing. Or if you did, it was an instant gross-negligence dismissal for whoever stopped documenting or following the documented procedure.
Financial firms will also hit time-based bugs before most organizations because they often deal with forecasting events 30+ years in the future (e.g. mortgages). For a bank, the 2038 rollover has been relevant since 2008.
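To make the arithmetic concrete, here is a rough Python sketch. It assumes the system stores dates as signed 32-bit Unix timestamps, and the mortgage origination date and the `overflows_time32` helper are made up for illustration:

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit time_t can hold (seconds since 1970-01-01 UTC).
INT32_MAX = 2**31 - 1
rollover = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

def overflows_time32(when: datetime) -> bool:
    """True if `when` cannot be stored as a signed 32-bit Unix timestamp."""
    return int(when.timestamp()) > INT32_MAX

# Hypothetical 30-year mortgage originated on 1 February 2008.
origination = datetime(2008, 2, 1, tzinfo=timezone.utc)
maturity = origination.replace(year=origination.year + 30)

print(rollover.isoformat())           # 2038-01-19T03:14:07+00:00
print(overflows_time32(origination))  # False
print(overflows_time32(maturity))     # True
```

So a loan written in early 2008 already has a maturity date past the rollover, even though the origination date itself is fine.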
I hit one of these on an EMC VNX array one time; after ~400 days all the controllers crashed at the same time. Didn't help that it happened at 4am on New Year's Day. I do recall other instances of this class of bug, but nothing specific.