
	by Gys 462 days ago
	> their durability number means that one can expect to lose data about once every 10,000 years

	What does that mean? If I have 1 million objects, do I lose 100 per year?

lukevp 462 days ago

What it means is in any given year, you have a 1 in 10,000 chance that a data loss event occurs. It doesn’t stack like that.

If you had light bulbs that lasted 1,000 hrs on average, and you had 10k light bulbs, and turned them all on at once, then they would all last 1,000 hours on average. Some would die earlier and some later, but the top line number does not tell you anything about the distribution, only the average (mean). That’s what MTTF is: the mean time to failure for a given part, not a guarantee about when any individual part fails. It doesn’t tell you if the distribution of light bulbs burning out is 10 hrs or 500 hrs wide. If it’s the latter, you’ll start seeing bulbs out within 750 hrs, but if it’s the former it’d be 995 hrs before anything burned out.

8organicbits 462 days ago

Isn't it just a marketing number? I didn't think durability was part of the S3 SLA, for example.

TheNewsIsHere 461 days ago

Object integrity isn’t part of the S3 SLA. I assume that is mostly because object integrity is something AWS can’t know about per se.

You could unknowingly upload a corrupted file, for example. By the time you discover that, there may not be a clear record of operations on that object. (Yes, you can record S3 data plane events but that’s not the point.)

Only the customer would know if their data is intact, and only the customer can ensure that.

The best S3 (or any storage system) can do is say “this is exactly what was uploaded”.

And you can overwrite files in S3 with the appropriate privileges. S3 will do what you ask if you have the proper credentials.

Otherwise, S3 is designed to be self-healing, using erasure coding and storing copies in at least two data centers per region.

catlifeonmars 461 days ago

S3 supports checksumming; you just need to provide a hash in a header when you upload an object.
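
A rough sketch of what "provide a hash in a header" can look like, using only the Python standard library. The header names `Content-MD5` and `x-amz-checksum-sha256` are my understanding of what S3 accepts, so check the current S3 documentation before relying on them; `checksum_headers` is a made-up helper name, and no request is actually sent here.

```python
import base64
import hashlib


def checksum_headers(body: bytes) -> dict[str, str]:
    """Build integrity headers for an upload (hypothetical helper).

    Both values are the base64 encoding of the raw digest bytes,
    not of the hex string.
    """
    md5_digest = hashlib.md5(body).digest()
    sha256_digest = hashlib.sha256(body).digest()
    return {
        # Assumed header names; verify against the S3 API reference.
        "Content-MD5": base64.b64encode(md5_digest).decode("ascii"),
        "x-amz-checksum-sha256": base64.b64encode(sha256_digest).decode("ascii"),
    }


if __name__ == "__main__":
    for name, value in checksum_headers(b"hello").items():
        print(f"{name}: {value}")
```

The service can then compare the digest it computes on receipt with the one you sent and reject the upload on a mismatch. That covers corruption in transit, not a file that was already corrupt before you hashed it.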

TheNewsIsHere 459 days ago

Yes, but my point stands. If AWS added S3 data integrity to the SLA, it would be making that commitment contractually. If you add checksum data, the checksums would (logically) be required and would also be in scope of the SLA. If there were a mismatch between them and the file still functioned, it would be impossible to sanely adjudicate who is responsible for the discrepancy, or what the nature of that discrepancy might be if no other copies of the file exist.

AWS probably doesn’t want those risks and ambiguities.

ceejayoz 462 days ago

Amazon claims 99.999999999% durability.

If you have ten million objects, you should lose one every 10k years or so.
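
As a rough sanity check of that arithmetic, here is a minimal sketch. It assumes the 11 nines figure is an annual, per-object durability and that objects are lost independently of each other; both are simplifications. The function names are made up for illustration.

```python
from fractions import Fraction

# Assumption: 11 nines is an annual, per-object durability figure,
# and each object is lost independently of the others.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)  # 1 - 0.99999999999


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost per year (hypothetical helper)."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


if __name__ == "__main__":
    for count in (1_000_000, 10_000_000):
        years = int(years_per_expected_loss(count))
        print(f"{count:,} objects: about one loss every {years:,} years")
```

Under those assumptions, ten million objects give one expected loss per 10,000 years, and the one million objects from the original question give one per 100,000 years rather than 100 per year.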

graemep 462 days ago

How does that compare to competitors and things like distributed file systems?

huntaub 462 days ago

I generally see object storage systems advertise 11 9s of durability. You would usually see a commercial distributed file system advertise less, trading some durability for performance (obviously stuff like Ceph and Lustre will depend on your specific configuration).

gamegoblin 462 days ago

In general if you actually do the erasure coding math, almost all distributed storage systems that use erasure coding will have waaaaay more than 11 9s of theoretical durability

S3's original implementation might have only had 11 9s, and it just doesn't make sense to keep updating this number; beyond a certain point it's just meaningless

Like "we have 20 nines" "oh yeah, well we have 30 nines!"

To give an example of why this is the case, if you go from a 10:20 sharding scheme to a 20:40 sharding scheme, your storage overhead is roughly the same (2x), but you have doubled the number of nines

So it's quite easy to get a ton of theoretical 9s with erasure coding
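
A back-of-the-envelope sketch of the 10:20 versus 20:40 comparison. I'm reading "10:20" as 10 data shards out of 20 total (any 10 suffice to rebuild), and assuming each shard is lost independently with some fixed probability before repair kicks in. The value of `p` below is made up purely for illustration, and real systems have correlated failures that this ignores.

```python
import math


def loss_probability(data_shards: int, total_shards: int, p_shard_loss: float) -> float:
    """Probability that too many shards are lost to rebuild the object.

    Assumption: each shard is lost independently with probability
    p_shard_loss. The object is unrecoverable once more than
    (total_shards - data_shards) shards are gone.
    """
    tolerated = total_shards - data_shards
    return sum(
        math.comb(total_shards, lost)
        * p_shard_loss**lost
        * (1 - p_shard_loss) ** (total_shards - lost)
        for lost in range(tolerated + 1, total_shards + 1)
    )


def nines(probability_of_loss: float) -> float:
    """Express a loss probability as a (fractional) count of nines."""
    return -math.log10(probability_of_loss)


if __name__ == "__main__":
    p = 0.01  # made-up per-shard loss probability, for illustration only
    for k, n in ((10, 20), (20, 40)):
        print(f"{k}:{n} -> {nines(loss_probability(k, n, p)):.1f} nines")
```

With that made-up `p`, the 10:20 scheme comes out at roughly 17 nines and the 20:40 scheme at roughly 31, so "doubled" is approximate but in the right ballpark, at the same 2x storage overhead.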

toolslive 462 days ago

It's really not that impressive, but you have to use erasure coding (chop the data D into X parts, use these to generate Y extra pieces, and store all X+Y of them) instead of replication (store D n times).
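
To make the X+Y idea concrete, here is a toy sketch of the simplest possible case, Y = 1, where the single extra piece is the XOR of the X data parts. Production systems typically use Reed-Solomon style codes that tolerate more than one loss; this is not that, and the function names are made up for illustration.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, x: int) -> list[bytes]:
    """Chop data into x equal parts and append one XOR parity piece (Y = 1)."""
    part_len = -(-len(data) // x)  # ceiling division
    padded = data.ljust(part_len * x, b"\0")  # zero-pad the last part
    parts = [padded[i * part_len:(i + 1) * part_len] for i in range(x)]
    parity = reduce(xor_bytes, parts)
    return parts + [parity]


def recover(pieces: list[Optional[bytes]]) -> list[Optional[bytes]]:
    """Rebuild at most one missing piece (marked None) from the survivors."""
    missing = [i for i, piece in enumerate(pieces) if piece is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost piece")
    repaired = list(pieces)
    if missing:
        survivors = [piece for piece in pieces if piece is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


if __name__ == "__main__":
    pieces = encode(b"hello world!", 4)  # 4 data parts + 1 parity piece
    damaged = list(pieces)
    damaged[2] = None                    # simulate losing one piece
    print(b"".join(recover(damaged)[:4]))
```

Here the storage overhead is (X+Y)/X = 5/4 while surviving one lost piece. Surviving one lost copy with replication needs n = 2, which is 2x overhead.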
