by johnmarcus 1607 days ago
Yup, so true. People think redundant == 100% uptime, or that when they advertise 99.9% uptime, it's the same thing as 100% minus a tiny bit for "glitches".

It's not. .1% of 365*24 = 87.6 hours of downtime - that's over 3 days of complete downtime every year!

For a more complete list of their SLAs for every service: https://aws.amazon.com/legal/service-level-agreements/?aws-s...

They only refund 100% when they fall below 95% availability! For 95-99%, it's 30%. I believe the real target is above 99.9% though, as that results in 0 refund to the customer. What that means is, 3 days of downtime is acceptable!
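
A minimal sketch of the credit tiers as this comment describes them, not AWS's published schedule. The 10% tier between 99% and 99.9% is taken from the edit further down in this comment; the function name and the exact boundary handling (which side of 95, 99, and 99.9 counts) are assumptions for illustration.

```python
def service_credit_percent(uptime_percent: float) -> int:
    """Return the refund percentage for a measured uptime.

    Tiers follow the comment's description only; the boundary
    handling (strict less-than) is an assumption.
    """
    if uptime_percent < 95.0:
        return 100  # below 95%: full refund
    if uptime_percent < 99.0:
        return 30   # 95% up to 99%
    if uptime_percent < 99.9:
        return 10   # 99% up to 99.9%
    return 0        # 99.9% or better: no refund


for uptime in (94.0, 97.5, 99.5, 99.95):
    print(f"{uptime}% uptime -> {service_credit_percent(uptime)}% credit")
```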

Alternatively, you can return to your own datacenter and find out firsthand that it's not nearly as easy to deliver that as you may think. You too will have power outages, network provider disruptions, and the occasional "oh shit, did someone just kick that power cord out?" or complete disk array meltdowns.

Anywho, they have a lot more room in their published SLAs than you think.

Edit: as someone correctly pointed out, I made a typo in my math. It is only ~9 hours of allotted downtime. Keep in mind that this is per service though, meaning each service can have a different 9 hours of downtime before they need to pay out 10% of that one service. I still stand by my statement: their SLAs have a lot of wiggle room that people should take more seriously.
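
To put a rough number on the per-service point, here is a small sketch. It assumes an application depends on several services, each of which stays exactly at 99.9% for the year, and that their outages never overlap (the worst case). The function name and the figure of five services are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, same year length used in the math above


def worst_case_combined_downtime_hours(availability_percent: float,
                                       service_count: int) -> float:
    """Yearly downtime if every service uses its full allowance
    and no two outages overlap (an assumed worst case)."""
    per_service = (1 - availability_percent / 100) * HOURS_PER_YEAR
    # Combined downtime can never exceed the year itself.
    return min(per_service * service_count, HOURS_PER_YEAR)


# One service at 99.9%: about 8.76 hours a year.
print(round(worst_case_combined_downtime_hours(99.9, 1), 2))
# Five services at 99.9%, outages never overlapping: about 43.8 hours.
print(round(worst_case_combined_downtime_hours(99.9, 5), 2))
```

In that scenario none of the five services would have dropped below 99.9%, so going by the tiers described above, none of them would owe a credit.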

3 comments

As someone else said, your math is off. Your point is still reasonable, though.

The uptime.is website is a handy resource for these calculations. For example, http://uptime.is/99.9 says

"SLA level of 99.9 % uptime/availability results in the following periods of allowed downtime/unavailability:

    Daily: 1m 26s
    Weekly: 10m 4s
    Monthly: 43m 49s
    Quarterly: 2h 11m 29s
    Yearly: 8h 45m 56s"
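
The quoted figures can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions inferred from the numbers themselves, not from anything the site states here: a year of 365.2425 days (the Gregorian average), and durations truncated to whole seconds. All names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.2425  # assumption: average Gregorian year

# Period lengths in days; month and quarter are derived from the year.
PERIOD_DAYS = {
    "Daily": 1,
    "Weekly": 7,
    "Monthly": DAYS_PER_YEAR / 12,
    "Quarterly": DAYS_PER_YEAR / 4,
    "Yearly": DAYS_PER_YEAR,
}


def allowed_downtime_seconds(availability_percent: float,
                             period_days: float) -> float:
    """Seconds of downtime permitted in a period at a given availability."""
    return (100 - availability_percent) / 100 * period_days * SECONDS_PER_DAY


def format_duration(seconds: float) -> str:
    """Format as e.g. '2h 11m 29s', truncating fractional seconds."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    parts = [f"{hours}h"] if hours else []
    parts += [f"{minutes}m", f"{secs}s"]
    return " ".join(parts)


for name, days in PERIOD_DAYS.items():
    print(f"{name}: {format_duration(allowed_downtime_seconds(99.9, days))}")
```

With a plain 365-day year, the yearly figure comes out about 20 seconds shorter, which is why the 365.2425-day assumption is used here.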

Your computation is incorrect: 3 days out of 365 is roughly 1% downtime, not 0.1%. I believe your error stems from entering .1% as 0.01 instead of 0.001. Indeed:

0.001 (.1%) * 8760 (365d*24h) = 8.76h

Alternatively, the common industry standard in infrastructure (at the place I work, at least) is 4 nines, so 99.99% availability, which is around 52 mins a year or 4 mins a month iirc. There's not as much room as you'd think! :)

> Yup, so true. People think redundant == 100% uptime, or that when they advertise 99.9% uptime, it's the same thing as 100% minus a tiny bit for "glitches".

Maybe this is the problem. 99.9% isn't being used by AWS the way people use it in conversation; it has a definite meaning, and they'll refund you based on that definite meaning.