Hacker News
by bravetraveler 630 days ago
I've always seen it as a 'mode of operation' for a time period. Same schedule/timing unless something bad happens. Then you're the one to be woken up/disturbed. Outside of that... you're generally free to do whatever maintenance, process, or feature work.

This is helpful when the incidents are less 'something to revert'... and more something to do or completely remove. For example, if CI/CD relies on things on the internet, deploying caches removes a laundry list of potential snags.

On call is a bit bipolar as a result. Either comfortably wandering around looking for something worth working on, or, knowing what it is, dashing to put out flames! It's not sustainable so we all take turns.

I believe a poster above was correct with their intuition. I feel there's a broken/missing feedback loop. Regular incidents happen, but they shouldn't be constant. The goal should be to eradicate them, accepting a downward trend.