by Iggy173 2559 days ago
Also, being able to report accurately on an arbitrary source of downtime would mean you're smart enough to avoid that source of downtime in the first place.

Not that this couldn't have had an obvious cause (deleting all the servers in a datacenter, or something similarly trivial but severe), but it's likely impossible to guarantee status page accuracy.


You could just ping the servers once a minute and tell if they're up or not. No need to know why they've gone down.
That only indicates that the frontend of the service is up and potentially running. Being able to respond to a ping and being able to serve HTTP requests are two different things, and being able to serve HTTP requests and being a fully functional website are two different things again. Think of a wrong SSL certificate, a wrong domain mapping between frontend and backend, broken JS/CSS, etc.
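To make the layers concrete, here is a rough Python sketch of the checks described above: reachability, then a certificate-verified HTTPS request, then a look at the page content. The function names, the status labels, and the `marker` string are all made up for illustration, and a TCP connect stands in for a real ICMP ping (which usually needs elevated privileges).

```python
import socket
import urllib.error
import urllib.request
from typing import Optional, Tuple


def tcp_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Layer 1: can a TCP connection be opened at all? (stand-in for a ping)"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_fetch(url: str, timeout: float = 5.0) -> Tuple[Optional[int], str]:
    """Layer 2: HTTPS GET; urllib verifies the certificate by default."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return None, ""      # e.g. bad certificate, DNS failure, timeout


def classify(reachable: bool, status: Optional[int], body: str, marker: str) -> str:
    """Layer 3: combine the probes; `marker` is text a healthy page should contain."""
    if not reachable:
        return "unreachable"
    if status is None:
        return "no valid HTTPS response"
    if status >= 400:
        return "http error"
    if marker not in body:
        return "responding but broken"
    return "ok"


def check(host: str, marker: str) -> str:
    """Run the layers in order against a hypothetical host."""
    reachable = tcp_reachable(host)
    status, body = http_fetch(f"https://{host}/") if reachable else (None, "")
    return classify(reachable, status, body, marker)
```

Even the last layer only proves the HTML came back with the expected text. Broken JS/CSS or a failing backend call would still slip through, so the real thing would need something closer to a scripted browser session.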
This outage is a great example. I can ping Google Calendar servers and I get an HTTP response. SSL also works like a charm.

And yet everybody agrees it's down.

Most outages aren't as obvious as this one, and any ping will fail intermittently (often because the ping agent itself has failed). Google definitely has loads of pings hitting Google Calendar in various ways. Exposing this monitoring to the public is neither practical nor really useful. (And it would aid would-be attackers.)
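As a rough illustration of why a single failed ping means little, here is one common way to damp that noise: only call the service down when several independent agents have each failed several times in a row. This is a sketch under made-up assumptions (the function name, the agent names, and the thresholds are hypothetical), not a description of how Google's monitoring works.

```python
from typing import Dict, List


def is_down(recent: Dict[str, List[bool]], min_agents: int = 2, streak: int = 3) -> bool:
    """Decide whether the service looks down.

    `recent` maps an agent name to its latest probe results, oldest first
    (True = probe succeeded). An agent counts as failing only if its last
    `streak` probes all failed; the service counts as down only if at least
    `min_agents` agents are failing.
    """
    failing = 0
    for results in recent.values():
        last = results[-streak:]
        if len(last) == streak and not any(last):
            failing += 1
    return failing >= min_agents


# One flaky agent is not enough to declare an outage.
print(is_down({"agent-a": [False, False, False], "agent-b": [True, False, True]}))
# Two agents with a full failure streak are.
print(is_down({"agent-a": [True, False, False, False], "agent-b": [False, False, False]}))
```

The trade-off is detection delay: with one probe a minute and a streak of three, a real outage takes about three minutes to show up.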