by nick3443 631 days ago
"On call" shouldn't be an additional shift that keeps the employee at their desk. It's an emergency service with a defined SLA: acknowledge the pager within X time, review the issue and triage or escalate within Y time, then work on the issue until service is restored or the bug is rolled back (but not necessarily to the point of completing a long-term fix).
2 comments

This depends. There are several on-call paradigms.

In 2 of the 3 companies I've worked at that have on-call, the on-call rotation has meant "the totality of your duties is being on call for [X] duration". There are no features to push; there is Op X and tickets of varying priority levels.

I've always seen it as a 'mode of operation' for a time period. Same schedule/timing unless something bad happens; then you're the one to be woken up/disturbed. Outside of that... you're generally free to do whatever maintenance, process, or feature work.

This is helpful when the incidents are less 'something to revert'... and more something to do or completely remove. For example, if CI/CD relies on things on the internet, deploying caches removes a laundry list of potential snags.

On call is a bit bipolar as a result. You're either comfortably wandering around looking for something worth working on or, knowing what it is, dashing to put out flames! It's not sustainable, so we all take turns.

I believe a poster above was correct with their intuition. I feel there's a broken/missing feedback loop. Regular incidents happen, but they shouldn't be constant. The goal should be to eradicate them, accepting a downward trend.