by dataflow 900 days ago
Does one actually manage to avoid such outages for 10 years by making the problem recur every month? 'Cause I feel like stuff would still break even if you test and run it regularly.

You might still hit an outage, but you'll hit it within a month of deploying the code that caused it, so you'll have the context and the people to fix it before it recurs next month. Whereas if the outage happens in ten years, you'll need software archaeologists to find the root cause, and you likely won't have the expertise available to fix it.

And maybe you say "it's one outage either way, but isn't it better in ten years than next month?" But when you're constantly adding new services, eventually there will come a time when every month some new service is having its ten-year anniversary.
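
To put rough numbers on the anniversary argument, here is a toy sketch. Everything in it is an assumption for illustration: one new service launched per month for 20 years, each with a single latent bug that fires a fixed number of months after launch (the hypothetical `fuse_months`).

```python
def outage_months(launch_months, fuse_months):
    """Month in which each service's latent bug first fires."""
    return [launch + fuse_months for launch in launch_months]

# Assumption: one new service launches every month for 20 years (months 0..239).
launches = range(240)

ten_year_fuse = outage_months(launches, 120)  # bug fires ten years after launch
one_month_fuse = outage_months(launches, 1)   # bug fires within a month of launch

# Same number of first outages either way.
same_count = len(ten_year_fuse) == len(one_month_fuse)

# With the ten-year fuse, from month 120 onward every month in the simulated
# window has some service hitting its anniversary.
fired = set(ten_year_fuse)
outage_every_month = all(month in fired for month in range(120, 240))

print(same_count, outage_every_month)
```

The outage count is identical in both cases. What changes is how old the code is when it breaks: one month versus ten years.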