by e1ven
5543 days ago
For mission-critical things, you want to reduce the points of failure and ensure that when things DO go wrong, you have a reasonable escalation path. It's now been over 8 hours since it went down, and there's no fix from Google yet. It's been 4+ days on the missing file. If I were running in-house, I could have entirely restored the mail server from tape by now. I could have swapped over to a hot spare in a few minutes. I could have failed over to our backup internet service. I have a lot of options. With Google, my only option is to wait, and hope my business doesn't lose too much money while Google gets around to fixing it.
http://don.blogs.smugmug.com/2007/01/30/amazon-s3-outages-sl...
"""So what are we doing differently? Simple. Amazon serves as “cold storage” where everyone’s valuable photos go to live in safety. Our own storage clusters are now “hot storage” for photos that need to be served up fast and furious to the millions of unique visitors we get every day. That’s a bit of an oversimplification of our architecture, as you can imagine, but it’s mostly accurate."""
You can always maintain a hot backup of your site on your own servers and fail over to it, perhaps with reduced functionality, until the scalable cloud services come back online. For a mission-critical site, this would seem to be a reasonable tradeoff.
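
To make the fail-over idea concrete, here is a rough Python sketch, not anyone's actual setup. The backend names ("cloud", "local-backup") and the health-check callables are made up for illustration; in practice each check would probe a real service. The list is ordered by preference, and the first backend that reports healthy wins.

```python
from typing import Callable, Sequence, Tuple

# A backend is a (name, health_check) pair. Both are hypothetical here:
# a real health check would ping the service or fetch a status page.
Backend = Tuple[str, Callable[[], bool]]


def pick_backend(backends: Sequence[Backend]) -> str:
    """Return the name of the first backend whose health check passes.

    Backends are tried in the order given, so list the preferred
    (cloud) service first and the in-house fallback after it.
    """
    for name, is_healthy in backends:
        try:
            if is_healthy():
                return name
        except Exception:
            # A health check that raises is treated as a failed check.
            continue
    raise RuntimeError("no healthy backend available")


if __name__ == "__main__":
    # Simulate the cloud provider being down while our own servers are up.
    backends = [
        ("cloud", lambda: False),
        ("local-backup", lambda: True),
    ]
    print(pick_backend(backends))
```

The point of keeping the ordering explicit is that traffic drifts back to the cloud service as soon as its check passes again, while the reduced-functionality local copy only serves during the outage.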