| > With this on-call culture, thanks to the 99.99% uptime that's become the de facto industry standard for most companies, I wonder whether such companies exist anymore! A lot of companies have terrible on-call processes, bad infrastructure, and not enough cost pressure to fix the infra. We do offer systems with 24/7 99.9% uptime in SLAs, yet people are happy to take on-call for a week. They do because a normal week has zero or one out-of-hours incidents. Two is considered a lot. At three interrupted sleeps, rotation is offered. At four, rotation is forced, and at that point we usually start going through full post-mortems for all pages to squash the problem. On top of that, a long escalation at night means the person is usually legally not allowed to work until noon the next day or later (10-12 hours of mandatory rest). That alone is a very good incentive for the company to fix the technical issues or processes causing these incidents, because all of a sudden a person pretty much isn't working for a week. Besides, after three interrupted sleeps they are effectively drunk, just without the fun. Similarly, evening work is usually planned 1-2 weeks ahead so we can batch 2-3 hours of evening work together. In turn, the person doesn't work the regular day. With measures like that, even we on the ops team can usually have normal working hours, starting between 8 and 10 and ending between 16 and 18. Unplanned work after 20:00 is very rare. |
I felt like I dodged a bullet.