Hacker News
by Octabrain 913 days ago
On call sucks so badly. At this point in my life, I firmly believe that no amount of money can compensate for the mental suffering it implies. Even more so if the company you work for has a "deal with it" mentality and makes no improvements, which was my case the last time I did on call and was the straw that broke the camel's back for me. Nowadays I simply refuse it. For those who are still in the trenches: stay strong, never resign yourself to just "deal with it", and thank you.
4 comments

Quote from a classic:

You might be under the impression that what makes you qualified for various positions in software development is primarily your technical acumen and ability to work with other technically-capable engineers.

You’d be wrong.

While a certain minimum of capability is required to do your day-to-day work, what your value really consists of is in grinding yourself against the piercing pincers of elusive bugs and razor-wire bundles of bullshit code until something resembling progress is made. You are not a problem-solver, you are a problem-endurer.

https://web.archive.org/web/20160317234837/https://medium.co... -> Point 4

To me, the worst part of being on call is the stress _after_ my shift ends. I understand that fixing issues that occur during my shift is a necessary part of the job, so I don't really mind that, but it gives me long-term issues. I feel anxious whenever I don't have my phone on me, or when I'm far enough into the wilderness to lose my cell signal. Late at night, when I don't expect to be getting messages from anyone, a random notification can sometimes give me an immediate stomach-drop panic response.

Unfortunately I feel like I lucked into this role and if I left I wouldn't be able to find anything anywhere near as good.

And I am not even sure whether you are talking about just daytime on-call, a simple 12-hour on-call, or 24-hour on-call in stretches of one to two full weeks. In India, the Indian managers (and American managers are just fine with it) have made an environment of this barbaric practice of 24x7 on-call handled by just one person.

In fact, even when there are US/Western counterparts, these subhumans make sure that Indian engineers are on call even during American daytime. This has been happening at my workplace. They employ all tactics, from fear and intimidation to trying to sweet-talk engineers into it with shit like, "Oh, we own it, right? So it's our responsibility to support even when it's night".

In that environment it becomes an extremely difficult, high-pressure situation for someone like me, who simply refuses to even sign up for something like PagerDuty and makes it clear that my phone remains silenced and out of my bedroom between 10pm and 7am, and it really does.

I agree with you: there is no amount of money that can make up for on-call, definitely not night-shift on-call.

> have made an environment of this barbaric practice of 24x7 on-call handled by just one person.

If it makes you feel any better this is very common in small to mid-sized US tech companies as well. In every team I've been on that had an oncall rotation it was a full week 24/7 per person, that rotated among team members. Even at Google we were on call for our own service overnight and didn't have SRE / other time zone oncalls.

But the number of pages and other work varied significantly between teams. The worst was the risk team at Square in 2016, where we routinely got paged 40+ times a week (mostly noise) and where real incidents were most likely on Saturday morning. The best was Instant Apps at Google, where we got a ~$5k bonus for each week of overnight oncall and almost never got a single page.

Why would that make me feel better, tdeck? It doesn't make me feel better.

Besides, what you describe where you are from is different from what I experience and see as the norm where I am from.

At the last (and only) job that required me to be on call, I quit the day before I was scheduled. I've always refused to do it. Devs have no business doing it.

I appreciate setting boundaries but I don't really understand this attitude. Frequently, on-call issues are caused by problems with the application logic, so solving them requires an understanding of the code. In my experience, oncall issues are not usually a simple case of force-restarting something or provisioning more boxes, although that can happen from time to time.

A system that can get itself into a non-functioning state and that can't be supported by an operator or dedicated support person is fundamentally broken and should not be in production. In my view, devs should never have access to production, under any circumstances, ever.

This is an artifact of devs (and others) not knowing what they're doing, and just hacking and hoping for the best. It's really not that hard to develop a system that is reliable and supportable in a basic way. Understanding the code shouldn't be a requirement, but understanding the system should, and that's a requirement of support personnel. Put another way, the functional model of the system has to be at a higher level than the code.

I'd argue that a software system that can be supported by a dedicated operator who isn't a developer is fundamentally broken. Any response protocol that can be handled by someone without familiarity with the system's internal workings can fundamentally be automated. Scaling hardware? Restarting boxes? Ignoring and silencing an alert? Draining traffic to a bad host? These are all fairly simple actions that could theoretically be automated, and have been automated at many companies. There shouldn't be a need for a person who can only do things like this.

On the other hand, there will be production issues caused by a complex interaction within the system that arises in an unforeseen edge case. These issues frequently require a code change which requires the ability to understand the codebase. In that case, the system is broken, but it's not "fundamentally" broken, it's broken for a particular edge case. Unfortunately, we may not have the luxury of waiting until 10 AM PST to start looking into the problem and coming up with a fix for it.
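
To make the two categories concrete, here is a minimal sketch of what automating the operator-level responses could look like. Everything in it is hypothetical: the `Alert` type, the alert kinds, and the handlers, which only return a description of the action instead of calling any real orchestration API. Anything without a known handler falls through to a human, which is where the code-level edge cases end up.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Alert:
    kind: str  # hypothetical alert category, e.g. "high_load"
    host: str  # host the alert fired on


# Hypothetical remediation handlers. Each one only describes the action;
# a real system would call its own orchestration tooling here.
def scale_out(alert: Alert) -> str:
    return f"scale out capacity near {alert.host}"


def restart_host(alert: Alert) -> str:
    return f"restart {alert.host}"


def drain_host(alert: Alert) -> str:
    return f"drain traffic from {alert.host}"


def silence(alert: Alert) -> str:
    return f"silence {alert.kind} on {alert.host}"


# Assumed mapping from alert kind to automated response.
RUNBOOK: Dict[str, Callable[[Alert], str]] = {
    "high_load": scale_out,
    "process_hung": restart_host,
    "host_unhealthy": drain_host,
    "known_noise": silence,
}


def respond(alert: Alert) -> str:
    """Run the automated response, or escalate if none is known."""
    handler = RUNBOOK.get(alert.kind)
    if handler is None:
        return f"page developer: unrecognized alert {alert.kind!r} on {alert.host}"
    return handler(alert)
```

With this shape, `respond(Alert("host_unhealthy", "web-1"))` takes the automated path, while an alert kind missing from `RUNBOOK` results in a page to someone who knows the code.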

> Devs have no business doing it.

Agreed, but I have to say that, as a DevOps engineer, it was infuriating to have to deal with developers who didn't care about the quality of what they were delivering. Sometimes it was because of pressure from someone higher in the chain, other times pure laziness and/or incompetence. I remember coming in the morning after a hell of a night on call, reporting the issues to the devs in charge, and getting an answer along the lines of "fixing that is not the priority right now", to which I replied in anger, "If it was your damn phone ringing the whole night, I'm pretty sure you would make it a priority".

There should be some sort of trigger whereby, over a certain threshold of problems, the devs have to perform the support role. It's unacceptable to deliver a shitty system and rely on support to avert disaster or user revolt; there has to be some sort of incentive to counter this.
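
A rough sketch of what such a trigger could look like, assuming you have a list of pages for the period, each tagged with the service that caused it. The function name and the threshold of 10 are made up for illustration, and what counts as a "problem" is left open.

```python
from collections import Counter
from typing import Iterable, List

PAGE_THRESHOLD = 10  # assumed limit per period, purely illustrative


def services_over_threshold(
    pages: Iterable[str], threshold: int = PAGE_THRESHOLD
) -> List[str]:
    """Return the services whose page count exceeds the threshold.

    Under the proposed rule, the devs owning these services would
    take over the support role for the next period.
    """
    counts = Counter(pages)
    return sorted(service for service, n in counts.items() if n > threshold)
```

For example, with 12 pages from a hypothetical `billing` service and 3 from `search`, only `billing` crosses the default threshold.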