by e12e 3633 days ago
>> Was shocked to see that your durability (27 9s) was so much higher than what S3 claims (11 9s)

> 27 9s is literally higher than my confidence that human civilization will be here in the next second. 10^27 seconds is about 32 quintillion years. Extinction events occur with a much higher frequency than that.

Without reading the article, I'm assuming this is the uptime probability they promise - so ( 1.0/10^9 ) * 365.25 * 24 * 3600 * 30 ~ 1 second of unavailability every 30 years (AFAIK Amazon is pretty far in the "red" on this one - they've had a few outages?) [ed: initially was off by a factor of 100 (and 10 in error...), the two 9s before the comma - the "complement" (1 - p) of 11 9s is 1/10^9, not 1/10^11].
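
A quick sketch of that arithmetic, assuming the unavailable fraction of time is 10^-9 as the comment takes it. Reading eleven 9s as the fraction 0.99999999999 instead gives a complement of 10^-11 and a result 100 times smaller, so both readings are computed. The function and variable names are illustrative only.

```python
# Julian year in seconds: 365.25 days * 24 h * 3600 s = 31,557,600 s
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_downtime_seconds(unavailable_fraction, years):
    """Expected seconds of downtime over `years`, given the fraction of time unavailable."""
    return unavailable_fraction * SECONDS_PER_YEAR * years


# Complement as used in the comment (1/10^9).
as_commented = expected_downtime_seconds(1e-9, 30)

# Alternative reading (assumption): eleven 9s as a fraction, 1 - 0.99999999999 = 1e-11.
as_fraction = expected_downtime_seconds(1e-11, 30)

print(f"1e-9  unavailable: {as_commented:.3f} s over 30 years")
print(f"1e-11 unavailable: {as_fraction:.5f} s over 30 years")
```

With 10^-9 the result is just under one second per 30 years, which matches the estimate in the comment.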

11 9s is already effectively the same as "will never go down in a way customers will notice" (actually, a second every 30 years per region isn't entirely insignificant, only almost insignificant). A higher guarantee does indeed seem silly. It's probably much more likely that we'll see annihilation by global thermonuclear war (for example), in which case I'm guessing the data would go offline for a while -- so such a number is meaningless.


That's not what they are promising; that would be insane. Please just read the article.
Point well taken - I was confused by the comments about how long 10^27 seconds is, and the "9s" nomenclature which is often used to measure uptime.

On the other hand, a Dropbox engineer upthread just claimed that their service has an external [ed2: durability] of 11 to 12 9s. So it does seem that they effectively claim that a block will practically never be unavailable because it cannot be read from (any) disk [ed2: i.e., unavailable due to failed durability]?

I do wonder a bit at the cost of padding redundancy up to such a high number. They don't mention block size, other than to say that 1 GB is filled with actual blocks. Let's say it's 1 MB, and they target 1 billion users, averaging 100 GB of data stored. That's 10⁹ users storing 10² GB each, with 10³ blocks per GB, or 10⁹⁺²⁺³ = 10¹⁴ blocks of data. That still leaves a lot of margin - and effectively the storage part of the system should never be the weakest link.

[ed: and that some users are likely to lose some data, if the 11 to 12 9s figure is to be taken "per block". But maybe it's per user? It seems unlikely that they really mean 11 9s of availability full stop]
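
A rough sketch of the block-count estimate and what a "per block" reading would imply. The assumptions here are mine and not confirmed anywhere in the thread: 1 MB blocks with decimal units, independent block losses, and a durability figure that applies per block per year. The names are hypothetical.

```python
USERS = 10**9          # 1 billion users (hypothetical target from the comment)
GB_PER_USER = 10**2    # 100 GB stored per user on average
BLOCKS_PER_GB = 10**3  # assumes 1 MB blocks, decimal units

# Multiplying out: 10^(9+2+3) = 10^14 blocks.
total_blocks = USERS * GB_PER_USER * BLOCKS_PER_GB


def expected_lost_blocks(total, nines):
    """Expected losses if each block independently survives with probability 1 - 10**-nines."""
    return total * 10.0 ** -nines


for nines in (11, 12, 27):
    lost = expected_lost_blocks(total_blocks, nines)
    print(f"{nines} nines: about {lost:.3g} blocks expected lost")
```

Under these assumptions, 11 to 12 nines per block would mean on the order of hundreds to a thousand lost blocks across 10^14, while 27 nines leaves the expected loss far below a single block.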

[ed2: Snuck in an extra "availability" where I meant "durability", rendering this second comment nonsensical as well...]

You are confusing availability ("uptime") with durability ("losing data").

I could make a system that was only available for 1 minute every 10 minutes (10% availability) but never lost a single file (100% durability).

I could also make a system that was never down (100% availability) but would randomly lose 1 in 10 files (90% durability).
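
The two hypothetical systems above can be sketched as two independent ratios. This is only an illustration of the distinction, with made-up file counts, not how any real service measures either figure.

```python
def availability(up_minutes, total_minutes):
    """Fraction of time the system can be reached."""
    return up_minutes / total_minutes


def durability(files_retained, files_stored):
    """Fraction of stored files that are never lost."""
    return files_retained / files_stored


# System A: up 1 minute in every 10, never loses a file.
system_a = (availability(1, 10), durability(1000, 1000))

# System B: never down, but loses 1 file in 10.
system_b = (availability(10, 10), durability(900, 1000))

print("A: availability", system_a[0], "durability", system_a[1])
print("B: availability", system_b[0], "durability", system_b[1])
```

Neither number constrains the other, which is why a "9s" figure means nothing until you know which of the two it describes.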

Common causes for availability issues are power outages, network outages (fiber cuts, etc.), and DNS issues.

Common causes for durability issues are software bugs, entire datacenter losses (e.g. earthquake destroys everything), or perhaps coordinated power outages if the system buffers things in volatile memory.