by brianwski 4528 days ago
Backblaze employee here - it is honestly just a spreadsheet that kicks out the answer. Every month we ask 20 or so suppliers for their lowest price on each drive type. If Hitachi drives are 10 percent more expensive but fail 10 percent less often, that balances out and we buy Hitachi. But if they are 12 percent more costly, then we get the other brand. There is a tiny bit of free preference leeway given to Hitachi because it means less hassle for our overworked datacenter team...

If I'm not reading it wrong, your data says that the Hitachi drives have half the Annual Failure Rate (AFR) of the others, or less (in your setup). Not sure what this means in MTBF, but the Hitachis sure seem to be worth a whole lot more, certainly 10, 20 or 30 percent more - no?
I don't think the math works. I posted this above:

I think the calculation is this: replacing one drive takes about 15 minutes of work. If we have 30,000 drives and 2 percent fail, it takes 150 hours to replace those. In other words, one employee for one month of 8-hour days. Getting the failure rate down to 1 percent means you save 2 weeks of employee salary - maybe $5,000 total? The 30,000 drives cost you $4 million, so who cares about $5k here or there?

$5k out of $4 million means the Hitachis are worth about 1/10th of 1 percent more to us. ACTUALLY we pay even more than that for them, but not more than a few dollars per drive (maybe 2 or 3 percent more).
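
For anyone who wants to check that arithmetic, here is a rough sketch in Python. The drive count, swap time, fleet cost and the $5,000 salary figure are the ones quoted above; the function and variable names are made up for illustration, and the salary figure is taken as given rather than derived.

```python
# Back-of-the-envelope check of the replacement-labor numbers quoted above.
# All names here are illustrative; the inputs come straight from the comment.
DRIVES = 30_000
MINUTES_PER_SWAP = 15
FLEET_COST_USD = 4_000_000
LABOR_SAVED_USD = 5_000  # the comment's rough guess for ~2 weeks of salary


def replacement_hours(failure_pct, drives=DRIVES, minutes=MINUTES_PER_SWAP):
    """Hours of swap work if failure_pct percent of the drives fail."""
    failed_drives = drives * failure_pct / 100
    return failed_drives * minutes / 60


hours_at_2_pct = replacement_hours(2)
hours_at_1_pct = replacement_hours(1)
hours_saved = hours_at_2_pct - hours_at_1_pct

# Price premium the labor savings alone would justify, as a fraction of fleet cost.
justified_premium = LABOR_SAVED_USD / FLEET_COST_USD

print(f"swap work at 2%: {hours_at_2_pct:.0f} h, at 1%: {hours_at_1_pct:.0f} h")
print(f"hours saved: {hours_saved:.0f}")
print(f"justified premium: {justified_premium:.3%}")
```

This only counts swap labor, same as the comment does; it ignores anything else a failure might cost.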

Moral of the story: design for failure and buy the cheapest components you can. :-)

Ok, after converting to MTBF the numbers make more sense: An AFR of 0.9% means an MTBF of 968,947 hours (111 years). An AFR of 3.2% means an MTBF of 269,346 hours (31 years).
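
The MTBF figures above are consistent with the usual constant-failure-rate (exponential) model, AFR = 1 - exp(-8760 / MTBF). The comment does not say which formula was used, so treat that as an assumption. A minimal Python sketch of the conversion, with a made-up function name:

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def afr_to_mtbf_hours(afr):
    """Convert an annual failure rate (as a fraction) to MTBF in hours.

    Assumes a constant failure rate, i.e. AFR = 1 - exp(-8760 / MTBF).
    """
    return -HOURS_PER_YEAR / math.log(1.0 - afr)


for afr in (0.009, 0.032):
    hours = afr_to_mtbf_hours(afr)
    years = hours / HOURS_PER_YEAR
    print(f"AFR {afr:.1%}: MTBF ~{hours:,.0f} hours (~{years:.0f} years)")
```

The constant-rate assumption is exactly what the last comment in the thread questions: if failures climb as drives age, an MTBF computed this way says little about how long any single drive will actually last.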

I guess an MTBF of 31 years is plenty for your needs. Thanks again for sharing the data.

I think the failure rate will go up in old age. I just don't see those drives still working in 100 years.