by emrekzd 3392 days ago
In December 2015, around 4 am, I received an e-mail from AWS with the following subject line:

"Amazon EC2 Instance scheduled for retirement"

When I checked the logs, it was clear the hardware had failed 30 minutes before they scheduled it for retirement. The EC2 instance and its root device data were gone. The e-mail also said "you may have already lost data".

So I know that Amazon schedules servers for retirement after they have already failed; the green check doesn't surprise me.

3 comments

Just as an FYI, the reason that probably happened to you is that the underlying host was failing. I assume they wanted to give you a window to deal with it, but the host croaked before then. I've been dealing with AWS for a long, long time and I've never seen a maintenance event go early unless the physical hardware actually died...
That's what happens when a cloud provider doesn't support live migration for VMs.
That's completely ridiculous. Get some fucking RAID, Amazon.

I order drives from Newegg directly to my DC, and I have yet to lose data with the cheapest drives available in RAID 10.

Yes, solving problems at your scale and AWS' are quite comparable.
But I never lost data off a USB stick, how hard could it be!
Really?!?! USB sticks (and USB HDs) have failed several times on me and on other people I work with.
Not saying my scale is the same at all, but the fact that they can't do something so simple that I can do it as a single individual is embarrassing at best.

Simple solutions to this do scale - Linode and DigitalOcean don't have such issues for example - and while they're not Amazon scale, they are quite large and I'd say they prove the concept.

EBS data is backed up in multiple redundant ways (using erasure coding, I think).

Local storage is not intended for permanent storage; it is more of a use-at-your-own-risk option. That's also why most of the new EC2 instances don't even support local storage.

Availability ≠ durability, of course.

EBS is incredibly expensive and slow, not really a good solution. It'd be nice if they offered a better local storage option.
Incredibly expensive and slow compared to what? A 500 GB SSD (gp2) costs $50/month and has 1,500 to 3,000 IOPS. It's okay for most loads.
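
The 500 GB figures can be checked with a little arithmetic. This is only a rough sketch: the constants are assumptions based on the gp2 model as I understand it (about $0.10 per GB-month in us-east-1, a baseline of 3 IOPS per GB with a floor of 100, and bursting to 3,000 IOPS for smaller volumes). The function name `gp2_estimate` is made up for illustration, and the upper IOPS cap on very large volumes is ignored.

```python
# Assumed gp2 parameters; check current AWS pricing before relying on them.
GP2_PRICE_PER_GB_MONTH = 0.10  # USD, assumed us-east-1 list price
GP2_IOPS_PER_GB = 3            # assumed baseline IOPS per provisioned GB
GP2_MIN_IOPS = 100             # assumed baseline floor for tiny volumes
GP2_BURST_IOPS = 3000          # assumed burst ceiling for smaller volumes


def gp2_estimate(size_gb):
    """Return (monthly_cost_usd, baseline_iops, burst_iops) for a gp2 volume.

    Hypothetical helper: ignores the maximum-IOPS cap on very large volumes.
    """
    monthly_cost = round(size_gb * GP2_PRICE_PER_GB_MONTH, 2)
    baseline = max(GP2_MIN_IOPS, GP2_IOPS_PER_GB * size_gb)
    # A volume whose baseline already exceeds the burst ceiling does not burst.
    burst = max(baseline, GP2_BURST_IOPS)
    return monthly_cost, baseline, burst


if __name__ == "__main__":
    cost, baseline, burst = gp2_estimate(500)
    print(f"500 GB gp2: ${cost:.2f}/month, {baseline} baseline IOPS, {burst} burst IOPS")
```

Under those assumptions a 500 GB volume comes out at $50 a month with a 1,500 IOPS baseline and 3,000 IOPS burst, which matches the numbers quoted.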

For higher performance, you can use

1. EBS Provisioned IOPS (kind of expensive)

2. Aurora (for DB use)

3. The new I3 instances (super fast local storage at a reasonable price.)

I think most people rely on EBS and are happy with it. Sure, it depends on the use case, but I think it works for most of them.
It's not just a RAID that can fail. And everyone who uses AWS should expect failures. You should build your infrastructure to handle such failures well.
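
One small piece of "handling failures" is noticing retirement notices programmatically instead of waiting for the e-mail. The sketch below assumes boto3 is installed and credentials are configured; `retirement_events` and `fetch_status` are hypothetical helper names, and pagination is left out for brevity. It filters the response of EC2's `describe_instance_status` call for events with the code `instance-retirement`, skipping ones whose description is marked as completed.

```python
def retirement_events(status_response):
    """Return (instance_id, not_before, description) for pending retirements.

    `status_response` is a dict shaped like the result of boto3's
    ec2.describe_instance_status(); only the keys used here are required.
    """
    found = []
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if event.get("Code") != "instance-retirement":
                continue
            description = event.get("Description", "")
            # Completed events can remain visible for a while with this prefix.
            if description.startswith("[Completed]"):
                continue
            found.append(
                (status.get("InstanceId"), event.get("NotBefore"), description)
            )
    return found


def fetch_status():
    """Query EC2 for instance status (assumes boto3 and valid credentials)."""
    import boto3  # imported here so the filter above works without boto3

    ec2 = boto3.client("ec2")
    # Single page only; a real script would paginate.
    return ec2.describe_instance_status(IncludeAllInstances=True)


if __name__ == "__main__":
    for instance_id, not_before, description in retirement_events(fetch_status()):
        print(f"{instance_id} retires on or after {not_before}: {description}")
```

As the original post shows, the notice can arrive after the host has already died, so a check like this complements backups and redundancy rather than replacing them.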
They offer no RAID on local storage and only the expensive, I/O-restricted EBS as an alternative.
Yes, the only way a server can die is from non-raided disks.
Otherwise they should at least be providing customers their data back.
I think you misunderstood local storage. It is not intended to store data permanently. It is volatile storage, like RAM.