Hacker News
by Ozzie_osman 106 days ago
The worst bug I ever dealt with in a 20-year career was a leap second bug (back in 2012). Servers all slowed down dramatically and very suddenly, with CPU saturated. No relevant code changes or changes in traffic. Turns out they just got into that state due to a leap second. Some kind of livelock bug.

A restart fixed everything.

It wasn't just our site that went down. If I recall correctly, many other large sites (like Reddit, LinkedIn, etc.) also had the same issue. Guess no one thought to ask "did you try restarting it?"

3 comments

Yep, I was there too! HN thread from the time: https://news.ycombinator.com/item?id=4188412
Me too! In my case Postfix locked up and stopped sending mail, and there was a massive queue. I checked the logs and saw the same second twice, and that's when I learned about leap seconds. Since then I've had a reminder in my calendar every 6 months to check if one's been announced. Thankfully we've only had two.
I remember that well (because my manager at the time asked me afterwards "why were you up at 2 in the morning restarting services?" and didn't believe my answer :( )