by chc 4615 days ago
The abrasive headline is kind of unfortunate, as the actual moral of the story given at the end is exactly the right takeaway: Never assume your hardware is infallible, so always have backups that you know you can use when your server experiences a wildly improbable catastrophe.

Also, very impressed by Digital Ocean's response here. Given their reputation as a budget host, they really do put a lot of effort into service.

> wildly improbable catastrophe

Or an extremely probable one like a hard disk failure. They only last a few years; most data centers see an annual replacement rate in the 2-13% range. The failure rate is a known quantity, and the limited 1-3 year warranties reflect that expectation.

There isn't a host I've used more than a few years where I haven't seen hard drives (and power supplies) fail. I don't know if my experience is typical, but hardware RAID controllers seem to go bad on me not-infrequently too, losing the whole array at once. They don't pay you when it happens, they just replace it. DO was extremely generous here.

Was going to say the same thing. The odds of a dual drive failure on a RAID5 system with five 2TB drives are about 1 in 12. With 3TB drives that goes up to about 1 in 7.

The underlying issue is that the uncorrectable read error rate is 1 in 10^15 bits; this is just physics (thermal noise, read/write signal loss, etc.). With 8b/10b encoding, that is only about 90TB worth of bits. When rebuilding a RAID group of five from the four "good" 2TB drives (8TB of data to be read), you will see a failure in one of those four drives 1 in 11.25 times (90/8). With 3TB drives it is 1 in 7.5 times (90/12). With simple mirroring, you won't be able to re-silver a mirror 1 time in 45 (90/2), or slightly more than 2% of the time, for 2TB drives.
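
For anyone who wants to plug in their own drive sizes, here is a rough Python sketch of that arithmetic. It takes the "one unrecoverable read error per 90TB read" figure from the comment at face value and uses the same linear estimate (90 divided by the terabytes that must be read). The function names are made up for illustration, and the Poisson variant is an extra assumption of this sketch (independent errors at a constant rate), not something claimed in the comment.

```python
import math

# Taken from the comment above: roughly one unrecoverable read
# error (URE) per 90 TB of data read.
TB_PER_URE = 90.0


def rebuild_read_tb(drives_in_group: int, drive_tb: float) -> float:
    """Data read during a rebuild: every surviving drive, in full."""
    return (drives_in_group - 1) * drive_tb


def one_in_n(read_tb: float) -> float:
    """Linear estimate from the comment: a URE in about 1 of N rebuilds."""
    return TB_PER_URE / read_tb


def failure_probability(read_tb: float) -> float:
    """Poisson variant (assumption of this sketch): P(at least one URE)."""
    return 1.0 - math.exp(-read_tb / TB_PER_URE)


if __name__ == "__main__":
    cases = [
        ("RAID5, five 2TB drives", 5, 2.0),
        ("RAID5, five 3TB drives", 5, 3.0),
        ("mirror, two 2TB drives", 2, 2.0),
    ]
    for label, drives, size_tb in cases:
        read_tb = rebuild_read_tb(drives, size_tb)
        print(
            f"{label}: read {read_tb:g}TB, "
            f"about 1 in {one_in_n(read_tb):g} rebuilds, "
            f"Poisson P = {failure_probability(read_tb):.1%}"
        )
```

The Poisson figure comes out slightly below the linear one, because the linear estimate does not account for the chance of hitting more than one error in the same rebuild.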

Dual parity, or triple mirrors (x3) are now the minimum bars for making storage reliable.

Well it's just a bit unlucky to have both drives fail in a RAID (although not impossible).