by manfre 521 days ago
Status pages are rarely honest. The company will lie to salvage their SLA. "Degraded performance" or "some customers are experiencing an elevated error rate" should be interpreted as "service unavailable / outage".
4 comments

Someone else lamenting the delayed status page updates almost 2 years ago: https://news.ycombinator.com/item?id=35887213
I'd be curious how much they downplay downtime. It wouldn't be too hard to put together an honest status page that pulls and pushes something new to main every 5 minutes, creates an issue, comments on it, etc. Very basic, high-level checks.
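A rough sketch of what those checks could look like, in Python, using only the standard library and the ordinary `git` CLI. Everything named here is an assumption for illustration: `repo_dir` is a hypothetical local clone of a throwaway repository you control, with a remote called `origin` and a branch called `main`. The issue and comment checks are left out because they need an API token.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class CheckResult:
    name: str
    ok: bool
    seconds: float
    detail: str = ""


def run_git(args: Sequence[str], repo_dir: str, timeout: float = 60.0) -> None:
    """Run one git command in repo_dir; raises on non-zero exit or timeout."""
    subprocess.run(["git", *args], cwd=repo_dir, check=True,
                   capture_output=True, timeout=timeout)


def git_checks(repo_dir: str) -> Dict[str, Callable[[], None]]:
    """Build the pull and push probes for a hypothetical throwaway clone."""
    def pull() -> None:
        run_git(["pull", "--ff-only", "origin", "main"], repo_dir)

    def push() -> None:
        # An empty commit guarantees there is always something new to push.
        run_git(["commit", "--allow-empty", "-m", "uptime probe"], repo_dir)
        run_git(["push", "origin", "main"], repo_dir)

    return {"git pull": pull, "git push": push}


def run_checks(checks: Dict[str, Callable[[], None]]) -> List[CheckResult]:
    """Run every check, timing it and recording any exception as a failure."""
    results = []
    for name, check in checks.items():
        start = time.monotonic()
        try:
            check()
            results.append(CheckResult(name, True, time.monotonic() - start))
        except Exception as exc:
            elapsed = time.monotonic() - start
            results.append(CheckResult(name, False, elapsed, repr(exc)))
    return results


def summarize(results: List[CheckResult]) -> str:
    """One-line status: either all clear or the names of the failed checks."""
    failed = [r.name for r in results if not r.ok]
    if not failed:
        return "all checks passed"
    return "FAILED: " + ", ".join(failed)
```

Calling `summarize(run_checks(git_checks(repo_dir)))` from a cron job every 5 minutes and appending the result to a log would give the raw data for a status page. Any exception (non-zero exit, timeout, missing `git`) counts as a failure, which is deliberately blunt.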
PMs would never let an automated system make the company look bad, nor would they let engineers have time to build such a system.
I was thinking of a guerrilla uptime monitor here: one run independently, not maintained by GitHub.
I'm not convinced it's deliberate dishonesty, just a communication disconnect. First, it can take time to get from the first yellow flags to the full realization that an incident really is underway. Second, someone needs to decide how to communicate that incident. Third, the engineers who are actually working on the incident need to be able to get on with it instead of being pestered for an update every 10 minutes.
They used to be. GitHub's is a prime example of how "useful info" has turned into "PR mouthpiece": it used to display graphs of a few choice GitHub system metrics, and spikes in those could often usefully indicate "yeah, they're having a problem" well before a human could update the page.

But yeah, status pages also seem to be the domain of non-engineers who are more concerned with how things look than with conveying useful information in a timely manner, and who ultimately fail at both.