by patmcc 4417 days ago
Their uptime calculation is... interesting, to be polite. Look at Dec 2013 or Sep 2013 and you'll see major incidents covering most of a day, but still three nines. How? See https://devcenter.heroku.com/articles/heroku-status#uptime-c... , but basically, if 40% of running applications are down for 10 minutes, they count it as a 4-minute outage.

If 40% of your applications are down for 10 minutes, that's 10 minutes of downtime.
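
To make the difference concrete, here's a rough sketch of the two ways of counting. It assumes a 30-day month and a single incident; the function names are made up for illustration and this is not Heroku's actual code.

```python
# Illustrative only: compares "weighted" downtime (duration scaled by the
# fraction of apps affected) with "full" downtime (any partial outage counts
# for its whole duration). All names here are hypothetical.

MINUTES_IN_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def weighted_downtime(incidents):
    """Sum of duration * fraction of apps affected, in minutes."""
    return sum(minutes * fraction for minutes, fraction in incidents)


def full_downtime(incidents):
    """Sum of incident durations, ignoring how many apps were affected."""
    return sum(minutes for minutes, _fraction in incidents)


def uptime_percent(downtime_minutes, total_minutes=MINUTES_IN_MONTH):
    """Convert downtime minutes into an uptime percentage for the period."""
    return 100.0 * (1 - downtime_minutes / total_minutes)


# One incident: 10 minutes long, 40% of running apps affected.
incidents = [(10, 0.40)]

print(weighted_downtime(incidents))  # 4 minutes under the weighted scheme
print(full_downtime(incidents))      # 10 minutes if any outage counts fully
print(uptime_percent(weighted_downtime(incidents)))
print(uptime_percent(full_downtime(incidents)))
```

A single short incident barely moves either number, but over a day-long incident the gap between the two schemes gets large.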

2 comments

That sounds pretty reasonable, actually. Sure, you can make the case that if 40% of the apps are down, then that's a whole-service downtime, and should be counted as such. But what if it's 4%? Or 1%, or 20 apps? Should a downtime that took down 20 apps be counted at the same level as one that took down 80% of all of their apps?

Seems like the best compromise is exactly what they did. The numbers are about as relevant to people looking to serve their applications as they could be without breaking things down by availability zone or stack or some other metric.

It's not completely backwards, and I'm glad they're reasonably upfront about it. I think given their business, I'd prefer a range or histogram - "uptime was between 96.4% and 99.99% depending on the customer" or "uptime was 96% for 10% of customers and 99.8% for 90%" or something.
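
Something like this, as a rough sketch. The downtime numbers are entirely made up, the function names are hypothetical, and it assumes you can attribute downtime minutes to each customer:

```python
# Illustrative only: per-customer uptime, reported as a range and as the
# share of customers at or above a threshold. The data below is invented.

MINUTES_IN_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def per_customer_uptime(downtime_by_customer, total_minutes=MINUTES_IN_MONTH):
    """Map each customer to an uptime percentage for the period."""
    return {
        customer: 100.0 * (1 - minutes / total_minutes)
        for customer, minutes in downtime_by_customer.items()
    }


def uptime_range(uptimes):
    """Return (worst, best) uptime percentage across customers."""
    values = list(uptimes.values())
    return min(values), max(values)


def share_at_or_above(uptimes, threshold):
    """Fraction of customers whose uptime is at least `threshold` percent."""
    values = list(uptimes.values())
    return sum(v >= threshold for v in values) / len(values)


# Hypothetical month: nine customers lose 4 minutes, one loses a full day.
downtime = {f"customer_{i}": 4 for i in range(9)}
downtime["customer_9"] = 24 * 60

uptimes = per_customer_uptime(downtime)
worst, best = uptime_range(uptimes)
print(f"uptime was between {worst:.2f}% and {best:.2f}% depending on the customer")
print(f"{share_at_or_above(uptimes, 99.9):.0%} of customers saw at least 99.9%")
```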