Hacker News
by pixl97 1603 days ago
It doesn't matter if each individual detects the outage, because they'll start by blaming the local source and work their way up the tree rather than assign blame to a full system failure right off the bat. 99.9% of the time it's going to be a local failure affecting only that individual.

Also, most alerting systems check multiple times before declaring a public outage; often 2 to 3 failures some seconds apart are needed.
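
For illustration, here's a rough sketch of that "several failures in a row, a few seconds apart" rule. This isn't how any particular alerting system does it; `confirm_outage`, `probe`, `required_failures` and `interval_seconds` are all made-up names, and the defaults (3 failures, 5 seconds) are just assumptions picked to match the numbers above.

```python
import time
from typing import Callable


def confirm_outage(
    probe: Callable[[], bool],
    required_failures: int = 3,      # assumed threshold, not from any real tool
    interval_seconds: float = 5.0,   # assumed spacing between checks
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if `probe` fails `required_failures` times in a row.

    `probe` is a hypothetical health check that returns True when the
    service looks healthy. A single success ends the run early, so a
    one-off blip never gets declared as an outage.
    """
    for attempt in range(required_failures):
        if probe():
            return False  # recovered: treat earlier failures as transient
        if attempt < required_failures - 1:
            sleep(interval_seconds)  # wait before re-checking
    return True
```

You'd call it with whatever check you already have, e.g. `confirm_outage(lambda: ping_service())` where `ping_service` is your own function. The `sleep` parameter is only there so the wait can be swapped out in tests.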

To add to this, my experience is that you never want a fully automated status page, for two more reasons:

1. External engineers will start to automate recovery/mitigation processes around your status page if it shows real-time status.

2. You now need to bug test your status page thoroughly because of #1. It basically becomes an actual API.

That sounds like a problem for the external engineers, not for GitLab.
Users make their problems yours.