by t-writescode 1990 days ago
> That is the typical on-call experience, getting woken up for 15-30 minutes each night, cortisol from 0 to 100 in the 15 seconds it takes to get into Work Mode.

The only company where I ever had that happen was the big company. At the other two companies I've worked at that had on-call rotations, if anything like that happened, we would tweak alarm levels so it didn't happen again.

If you're not tweaking alarm levels or fixing code to clear out false alarms, it's not a sustainable on-call rotation and that needs to be fixed immediately.
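One common form of "tweaking alarm levels" is to page only when a metric stays over its threshold for several samples in a row, so a single spike doesn't wake anyone. A rough sketch of that idea; `should_page`, the threshold, and the `sustained` count are made-up names and numbers for illustration, not anything from a specific monitoring tool:

```python
from typing import Iterable


def should_page(samples: Iterable[float], threshold: float, sustained: int = 3) -> bool:
    """Return True only if `sustained` consecutive samples exceed `threshold`.

    Hypothetical helper: a real alerting system would evaluate this over a
    time window, but the debouncing logic is the same.
    """
    run = 0  # length of the current streak of over-threshold samples
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= sustained:
            return True
    return False


# A one-off spike does not page; a sustained breach does.
spike = [10.0, 95.0, 20.0, 10.0]
outage = [10.0, 95.0, 96.0, 97.0]
```

With a threshold of 90, `should_page(spike, 90)` stays quiet and `should_page(outage, 90)` pages.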

I've been the solitary on-call for the main service of a company before, and I almost never got called because 1) we had good KB articles for the operations center for when things did break, and 2) things very rarely broke in a way that wasn't automatically fixable.

It's amazing in how many cases "on service crash, automatically remove the broken machine from the pool, restart the service, and bring the machine back" is a valid fix for the weird, extra edge-case junk that would otherwise be a call.
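For anyone who wants to picture the remove, restart, re-add loop described above, here is a minimal sketch. Everything in it is assumed for illustration: `Machine`, `remediate`, the `restart` and `is_healthy` callbacks, and the retry limit are hypothetical, and a real setup would talk to a load balancer and a process supervisor instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Machine:
    name: str
    in_pool: bool = True


def remediate(
    machine: Machine,
    restart: Callable[[Machine], None],
    is_healthy: Callable[[Machine], bool],
    max_restarts: int = 3,
) -> bool:
    """Pull the machine from the pool, restart it, and re-add it once healthy.

    Returns True if the machine is back in the pool, or False if it is
    still unhealthy after `max_restarts` attempts, which is the point
    where a human should get paged.
    """
    machine.in_pool = False  # stop sending traffic to the broken machine
    for _ in range(max_restarts):
        restart(machine)
        if is_healthy(machine):
            machine.in_pool = True  # healthy again, put it back in service
            return True
    return False  # leave it out of the pool and escalate


# Simulated machine that only comes back healthy after the second restart.
restarts = {"count": 0}


def fake_restart(m: Machine) -> None:
    restarts["count"] += 1


def fake_health(m: Machine) -> bool:
    return restarts["count"] >= 2


web1 = Machine("web1")
recovered = remediate(web1, fake_restart, fake_health)
```

The important design choice is the `False` branch: the machine stays out of the pool and only then does someone get called, so the page means something actually needs a person.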

1 comment

I've experienced this at small startups and BigCorps. Granted, at the BigCorps fewer things blew up in general, and when they did, it was interesting.