What was stated seems pretty typical for small teams that manage critical systems. How are critical systems managed otherwise without having to 3x your team (assuming you want 24hr coverage with an 8hr workday)?
>How are critical systems managed otherwise without having to 3x your team (assuming you want 24hr coverage with an 8hr workday)?
They typically aren't; you need to hire more people then. In anything defined as critical, I would suggest that being well rested is also a requirement, rather than having a team of people who are so overworked they develop mental illness.
Doesn't make sense financially. You're gonna triple the size of your team, just to make sure that this one time of the week someone will restart the DB?
I would much rather bite the bullet, be on call a week a month or so and use the budget for something else.
You don't need to keep the primary development team as primary oncall 24/7. "Hire people to keep an eye on things off-hours, follow playbooks when necessary, and escalate to the right people if they can't resolve" is a reasonable pattern that folks have done for decades the world over.
You get less incentive alignment perhaps, but it really isn't hard to come up with ways to make creating work for another team unappealing.
This is a law to protect the mental and physical well being of citizens and workers, obviously not Amazon's bottom line, that's literally the point. If you treat people like disposable commodities then yes this does not make any sense.
You know what impacts the mental and physical well-being of workers? Having their company go under because it had to triple the size of the team just to implement an on-call policy.
And I'm not talking about Amazon here. Of course Amazon can afford to hire everybody and their mom to cover 24 hours in a day. I'm under the impression that your original comment was really a criticism of the practice in general, not just this specific example.
Some small companies build products they try to sell to bigger companies. These bigger companies have SLAs, so they need SLAs from their providers to sign a contract. Even at a small startup, on call may be necessary to get customers.
>I'm under the impression that your original comment was really a criticism of the practice in general, not just this specific example.
It absolutely was criticism of the practice in general. Yes, small companies that can't ensure that their workers aren't overworked don't really make sense under protections like these. But the same is true for environmental protection.
Critical infrastructure arguably ought to be handled by companies large enough and with enough staff to comply with regulation like this.
For anything else I don't see the issue. Some random smartphone app company doesn't need to fix anything at 3 am; it can be fixed the next day. That's exactly the culture I'm criticizing. 24/7 readiness to work for some random product at the expense of mental health is awful.
I feel like we're just going in circles because you're not addressing any of the examples I'm providing.
I gave you examples of how to implement on call without overworking your employees and even giving them the option to run a few errands here and there. So what's the problem? You seem to be against the practice just on principle, not even really for any bad effects due to bad implementations.
As for which products are worth implementing "on call" for, yes I agree, some companies are too quick to think that the world can't function 30 minutes without their product.
Anyways, I've seen it implemented successfully at companies with very low attrition and I do believe it is a necessary tool sometimes for small companies to grow and sign big whale customers.
Why are we assuming that Amazon, the third largest corporation in the world by market cap, needs to limit itself to small teams?
They could afford to 3x the teams if they wanted to. They just don't want to, because management prefers to just keep the money and burn out their subordinates. (There's always more where they came from.)
Then that's the cost of managing the service. The only reason companies don't pay that now is because of the significant leverage they have over software development labor. We may be better compensated than many other roles, but we're still just entirely expendable resources.
The only reason companies don't do that is because it doesn't make sense. What are you gonna do with a team that is 3 times the ideal size just to have enough people to cover a 24/7 on call?
It absolutely does make sense if the service is that critical. If it's not that critical, it doesn't need 24/7 on-call. There are some "in between" circumstances, sure, and those might require additional resources or occasionally work outside of normal hours.
It doesn't make financial sense because companies can get away with exploitative behavior because software engineers are quite naive (or even actively detrimental to themselves and their peers) when it comes to labor relations.
So you have 5 engineers building your product. Now you hire 10 more and organize them in 3 shifts. What work do you give to the 10 new engineers? Are you supposed to now hire 200% more PMs, designers, QA, etc and make up some new projects for these guys?
I see this comment again and again on this thread "if it's really critical spend the money", but the thing is even small companies sometimes have critical systems. Simply because they're trying to compete with bigger ones, or they're partnering with companies that do have critical systems and the SLA gets passed down the chain of dependencies.
More members on the team would increase the gap between two on-call periods for each team member (e.g. every 4 weeks instead of 2); it wouldn't get rid of on call entirely.
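To put rough numbers on the two models argued about in this thread, here is a minimal sketch in Python. It assumes full-team shift staffing for the "3x" case, and one person on call at a time with one-week rotations for the rotation case. The function names are made up for illustration, and real schedules (secondary on-call, holidays, follow-the-sun) would change the figures.

```python
import math


def shifts_needed(coverage_hours: int = 24, shift_hours: int = 8) -> int:
    """Number of back-to-back shifts needed to cover the requested hours."""
    return math.ceil(coverage_hours / shift_hours)


def staffed_headcount(team_size: int, coverage_hours: int = 24, shift_hours: int = 8) -> int:
    """Headcount if every shift is staffed with a full copy of the team (assumption)."""
    return team_size * shifts_needed(coverage_hours, shift_hours)


def weeks_between_rotations(team_size: int, rotation_weeks: int = 1) -> int:
    """With one person on call at a time (assumption), each person's turn
    comes around once every team_size * rotation_weeks weeks."""
    return team_size * rotation_weeks


# The "5 engineers, hire 10 more" case: three 8-hour shifts of 5 people each.
print(staffed_headcount(5))

# Growing a rotation from 2 to 4 people widens the gap but never removes on call.
print(weeks_between_rotations(2), weeks_between_rotations(4))
```

Under these assumptions, adding people to a rotation only stretches the interval between turns, while staffing shifts multiplies headcount by the number of shifts.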