Hacker News new | ask | show | jobs
by sjsdaiuasgdia 462 days ago
That's going to depend a lot on what your needs are, particularly in terms of redundancy and durability. S3 takes care of a lot of that for you.

One server with a RAID array can survive, usually, 1 or maybe 2 drive failures. The remaining drives in the array will have to do more work when a failed drive is replaced and data is copied to the new array member. This sometimes leads to additional failures before replacement completes, because all the drives in the array are probably all the same model bought at the same time and thus have similar manufacturing quality and materials. This is part of why it's generally said that RAID != backup.

You can make a local backup to something like another server with its own storage, external drives, or tape storage. Capacity, recovery time, and cost varies a lot across the available options here. Now you're protected against the original server failing, but you're not protected against location-based impacts - power/network outages, weather damage/flooding, fire, etc.

You can make a remote backup. That can be in a location you own / control, or you can pay someone else to use their storage.

Each layer of redundancy adds cost and complexity.

AWS says they can guarantee 99.999999999% durability and 99.99% availability. You can absolutely design your own system that meets those thresholds, but that is far beyond what one server with a RAID array can do.

2 comments

How many businesses or applications really need 99.999999999% durability and 99.99% availability? Is your whole stack organized to deliver the forementioned durability and availability?
I think that this is, to Andy's point, basically about simplicity. It's not that your business necessarily needs 11 9s of durability for continuity purposes, but it sure is nice that you never have to think about the durability of the storage layer (vs. even something like EBS where 5 9s of durability isn't quite enough to go from "improbable" to "impossible").
There are a lot of companies who their livelihood depends on their proprietary data, and loss of that data would be a company-ending-event. I'm not sure how the calculus works out exactly, but having additional backups and types of backups to reduce risk is probably one of the smaller business expenses one can pick up. Sending a couple TB of data to three+ cloud providers on top of your physical backups is in the tens of dollars per month.
Different people and organizations will have different needs, as indicated in the first sentence of my post. For some use cases one server is totally fine, but it's good to think through your use cases and understand how loss of availability or loss of data would impact you, and how much you're willing to pay to avoid that.

I'll note that data durability is a bit of a different concern than service availability. A service being down for some amount of time sucks, but it'll probably come back up at some point and life moves on. If data is lost completely, it's just gone. It's going to have to be re-created from other sources, generated fresh, or accepted as irreplaceable and lost forever.

Some use cases can tolerate losing some or all of the data. Many can't, so data durability tends to be a concern for non-trivial use cases.

> One server with a RAID array can survive, usually, 1 or maybe 2 drive failures.

Standard RAID configurations can only handle 2 failures, but there are libraries and filesystems allowing arbitrarily high redundancy.

As long as it's all in one server, there's still a lot of situations that can immediately cut through all that redundancy.

As long as it's all in one physical location, there's still fire and weather as ways to immediately cut through all that redundancy.