by herval 3807 days ago
Honest question: have you ever managed to work at a company that's either successful or growing very fast and can guarantee "various forms of testing and QA to ensure that production software does not have critical issues that warrant a 2am call"?

I can imagine that being possible in consultancies or small scale/load products, but I've honestly never seen it on anything larger than that, including environments with a mind-blowing number of layers of QA and tests...

3 comments

I've worked at a late-stage startup that serves billions of requests per day at peak load, and there was no on-call for developers. I believe the ops team did have an on-call, but that was more at the infrastructure level, as everything was self-hosted at colos.

This was done by having a simple and resilient serving architecture. Every server was stateless. In addition, all the complicated logic was pre-computed offline into immutable lookup tables. So if the offline pre-compute job fails, it doesn't cause downtime, and the fix can wait until the next workday.
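To make the idea concrete, here is a minimal sketch of that pattern, not the actual system. The names `TableHolder`, `lookup`, and `refresh` are hypothetical, and the offline job is assumed to be any callable that returns a fresh mapping. The point it illustrates: a failed rebuild leaves the last good table in place, so serving continues.

```python
from types import MappingProxyType
from typing import Callable, Mapping


class TableHolder:
    """Holds the current read-only lookup table used by request handlers.

    Hypothetical illustration: handlers keep no state of their own and
    only read from whichever table was last swapped in successfully.
    """

    def __init__(self, initial: Mapping[str, str]) -> None:
        # Copy, then wrap in a read-only view so handlers cannot mutate it.
        self._table = MappingProxyType(dict(initial))

    def lookup(self, key: str, default: str = "") -> str:
        """Serve a request with a plain read from the current table."""
        return self._table.get(key, default)

    def refresh(self, build: Callable[[], Mapping[str, str]]) -> bool:
        """Try to swap in a freshly built table; keep the old one on failure."""
        try:
            new_table = MappingProxyType(dict(build()))
        except Exception:
            # The offline job failed: keep serving the last good table.
            return False
        self._table = new_table
        return True


def failing_build() -> Mapping[str, str]:
    """Stand-in for an offline job that crashes (assumed for illustration)."""
    raise RuntimeError("batch job crashed")


holder = TableHolder({"user:1": "segment-a"})
holder.refresh(failing_build)          # rebuild fails, nothing changes
print(holder.lookup("user:1"))         # still answers from the old table
```

The table gets stale when the job fails, but nothing goes down, which is why nobody needs to be paged.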

We had a robust QA process, but it was far from stultifying.

Absolutely - not everything is a service (yet). It's not sexy, but if you only ship your bits every month or every two weeks then you shouldn't have 2am calls. Everything can wait til tomorrow (unless tomorrow you're shipping, in which case I have seen people still awake at 2am, but I consider that a failure).

I work at a successful avionics company writing software that goes onto airplanes. We do not have critical issues that warrant 2am calls.

I'm in the camp of "will not work at companies requiring on-call or pager-duty."