by bob1029
2161 days ago
The whole point of putting engineers on call (in this context) is to encourage them to make good technology choices and to take some ownership of their products. If there were no counter-pressure from pager duty or other threats to a peaceful existence, most developers would just pick whatever technology they personally enjoy using the most and expect someone else to fix their special shitpile for them at 3am. Someone is always going to get screwed in this equation, so at least make it an equitable screwing. Being on call doesn't just apply to code, either. Would you be OK if no one tried to fix your broken water pipes or electricity until the following business day? Do we turn off the global internet at bedtime? At some point people are going to have to do shitty work to keep this world running. The best you can do is rotate the shitty work around so that everyone helps out. Automate what you can, share the load for what you cannot. If everyone does their part, it is a lot less painful all around.
|
That assumes engineers are even empowered to make technology choices. At many companies they are not, whether because of organizational structure or because the roadmap does not allow a major technology shift away from whatever "shitpile" you and your team have inherited.
Clear escalation strategies (including knowing when escalating to the original engineers behind a project is even appropriate) are often lacking. I wouldn't want to call engineers in at 3am for a problem that can be fixed by following a documented devops process. Plus, what happens when the engineer you need to reach is unavailable? What if they are sick, or don't wake up, or their phone died?
What happens when business pressure says "we're ok with calling engineers twice a week as long as the roadmap moves"?
"You built it you're on call" is a fragile way to handle problems in more ways than one.
Which isn't to say there shouldn't be shared responsibility. Of course there should. But responsibility without power is toxic. At the very least it increases flight risk, but in practice it often has a far wider-reaching deleterious effect than just that.