by dripton 3816 days ago
I work at a big famous company that strongly believes in devops.

I'm on call about one week in six. (About 12 people in the rotation, with two on call at a time.)

When on call, the ticket queue is supposed to have priority, but managers have an unfortunate tendency to say "that guy's on call so he's not doing anything so let's have him do it." In practice, high-severity tickets get priority, then whatever random stuff the manager wants done, then low-severity tickets. So low-severity tickets tend to stay open far too long, because the oncalls never get around to making them a priority.
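To make that ordering concrete, here is a rough sketch of the queue as it ends up working in practice. The category names, the `WorkItem` type, and the oldest-first tie-break are all made up for illustration; nothing here is our actual tooling.

```python
from dataclasses import dataclass

# De facto order of work for an oncall, as described above.
# Lower number = worked first. Category names are hypothetical.
OBSERVED_PRIORITY = {
    "high_sev_ticket": 0,
    "manager_request": 1,
    "low_sev_ticket": 2,
}


@dataclass
class WorkItem:
    title: str
    kind: str      # one of the keys in OBSERVED_PRIORITY
    age_days: int  # how long the item has been open


def work_order(items):
    """Sort by category first; within a category, oldest first (an assumption)."""
    return sorted(items, key=lambda i: (OBSERVED_PRIORITY[i.kind], -i.age_days))
```

Note that a low-severity ticket never moves ahead of a manager request here no matter how old it gets, which is exactly why those tickets linger.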

For high-severity tickets, we escalate if we can't fix things quickly. First bring in the other oncall, then bring in the manager, then bring in anyone else on the team who can help.
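The escalation chain above could be sketched roughly like this. The function names and the example people are hypothetical, and I'm assuming each escalation adds one stage on top of whoever is already engaged.

```python
def escalation_stages(primary, secondary, manager, team):
    """Build the stages in the order described above.

    Stage 0: the oncall holding the ticket
    Stage 1: the other oncall
    Stage 2: the manager
    Stage 3: anyone else on the team who can help
    """
    already_engaged = {primary, secondary, manager}
    rest = [person for person in team if person not in already_engaged]
    return [[primary], [secondary], [manager], rest]


def engaged_after(stages, escalations):
    """Everyone working the ticket after the given number of escalations."""
    people = []
    for stage in stages[: escalations + 1]:
        people.extend(stage)
    return people
```

For example, with `stages = escalation_stages("ann", "bob", "cat", ["ann", "bob", "cat", "dan", "eve"])`, calling `engaged_after(stages, 1)` gives both oncalls, and `engaged_after(stages, 3)` gives the whole team.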

When a dependency breaks, we can ticket its owners to fix it. If the breakage gives us a high-severity ticket, then they get a high-severity ticket too. Ultimately there's a tradeoff between using fewer dependencies in the name of controlling your own fate, versus using more dependencies to avoid reinventing the wheel. Each team has to find that balance for itself.

I don't enjoy being on call, but I think it's the best solution we have. Stuff breaks and someone has to fix it. If you don't pick someone, then everyone has to fix it. If the same people who build something also have to fix it, they have strong incentives to do things correctly.