Hacker News
by pixl97 981 days ago
Honest status pages are a really, really hard problem.

If the connector between your status monitor and the service breaks, you'll have some subset of users panicking and causing problems (or asking for refunds for outages) when the service was up the entire time.

Third-party services are the only ones that will give you a "more honest", but not always correct, view of what the actual status is.


Even if they just updated it manually for major incidents, it would still be useful.
Define major incident. Define update.
When an incident is declared, have someone tasked with determining customer impact. If the impact radius is greater than a handful of customers, declare a public incident. If customer communication is made a priority, then you can actually have a helpful status page.
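
To make that rule concrete, here is a rough sketch of the judgment call as code. Everything in it is an assumption for illustration: the `Incident` fields, the `should_post_publicly` name, and especially the threshold of 5, since "a handful of customers" is not a number anyone actually gave.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the rule only says "greater than a handful of customers".
PUBLIC_INCIDENT_THRESHOLD = 5


@dataclass
class Incident:
    title: str
    false_alarm: bool
    # Estimate supplied by whoever is tasked with determining customer impact.
    affected_customers: int


def should_post_publicly(
    incident: Incident, threshold: int = PUBLIC_INCIDENT_THRESHOLD
) -> bool:
    """Return True if the incident should be declared on the public status page."""
    if incident.false_alarm:
        return False
    return incident.affected_customers > threshold


if __name__ == "__main__":
    for inc in (
        Incident("Elevated API error rate", False, 40),
        Incident("Single tenant misconfiguration", False, 2),
        Incident("Flapping alert", True, 1000),
    ):
        print(inc.title, "->", should_post_publicly(inc))
```

The code is the easy part. The hard part is getting a trustworthy `affected_customers` estimate quickly, which is why the rule has a person assigned to it.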

Where I work, just about any non-false-alarm incident ends up on the status page in a timely manner. There's nothing stopping the likes of AWS from doing the same except for culture.

Exactly. It's the kind of thing where one person managing it and deciding whether to post updates is probably gonna work better than anything automated.

Well, maybe not at AWS scale though, if they have thousands of everything =/

Why? It's just a judgment call.