Hacker News
by cheald 3658 days ago
> Wait, this seems to ignore the fact that with good change controls and sound code, products do not just fail at 3am.

Because network outages never happen, disks never fail or fill up, memory is never an issue, programs always deal with only the data they were expected to, products never do more traffic than expected, and all infrastructure software ships completely bug-free.

If you aren't occasionally up at 3 AM fixing unexpected outages, then either you haven't deployed a project that requires uptime or you're paying someone else to do it for you.

2 comments

> Because network outages never happen, disks never fail or fill up, memory is never an issue, programs always deal with only the data they were expected to, products never do more traffic than expected, and all infrastructure software ships completely bug-free.

Several of these are exactly the kind of thing where the whole idea of devops just falls apart. If a disk fails, what use is a dev vs. a good ops person?

The issue is not with problems, but with the arbitrary nature of this as a learning tool. Getting a page to fix an issue that you foresaw and prevented is dumb. Teach them to make resilient systems rather than get out of bed.

I have never gotten up at 3 AM to fix an outage, because outages do not impact systems that widely. If it is a network problem, let the network team do their job. All of your other problems are solved with multiple HA/load-balanced servers, monitoring, and proper testing.