by throwaway43234 2161 days ago
Do you trust the service with the status page that is 100% green for all of history, or the one that reports the occasional degradation?

Reporting accidents isn't the issue. It's having them on the regular and being past the standard operational period that worries me. Not sure why that's contentious.
In an ideal world, you have your threshold for "incident" set low enough that you have a lot of them. Catching more of the distribution of problems lets you increase the level of safety.
Exactly. This is the difference between "we're only going to update the status page if a customer comes screaming to us that the service has been unavailable for the past 24 hours and they've received no updates" and "we have automated systems in place to update the status page if the p95 response time exceeds XXX (among other things)".
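For what it's worth, the automated version doesn't have to be elaborate. Here is a rough sketch of the p95 check, not anyone's actual implementation. It assumes you already collect response times in milliseconds over some window. The names `p95` and `status_for` are made up for illustration, and the threshold is left as a parameter because the right value depends on the service.

```python
def p95(samples):
    """Return the 95th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank: 1-based rank = ceil(0.95 * n), done in integer arithmetic
    # to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def status_for(samples, threshold_ms):
    """Hypothetical helper: map a window of response times to a status label.

    threshold_ms is whatever limit the operator picks; no default is assumed.
    """
    return "degraded" if p95(samples) > threshold_ms else "operational"
```

A scheduled job could call `status_for` on the most recent window and push the result to the status page. A real setup would probably also want some hysteresis so the page doesn't flap on a single bad window.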