HN Mirror


by technion 3695 days ago

    The problem with failures during rebuilds is overblown

I thought I was the only one who believed that. I've said this on Reddit before and ended up at something like -20 votes, with people flatly arguing that I was making up an "impossibility".

I've got roughly 30 arrays in production, with between 4 and 12 disks in each. All are RAID5 + hot spare. If you believe the maths people keep quoting, the odds of seeing a total failure in a given year are close to 100%. I started using this configuration, across varying hardware, over 15 years ago, and the number of arrays has been growing since.

I'm not pretending one example proves the rule, or that it's totally safe, or that I would run a highly critical environment this way (before anyone comments: these environments do not meet that definition). But people have tried to show maths that there's a six-nines likelihood of failure, and I just don't for a second believe I'm that lucky.
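For reference, the calculation usually quoted looks roughly like the sketch below. It assumes every bit read during a rebuild fails independently at the datasheet URE rate (1 in 10^14 bits is a typical consumer-drive spec), which is exactly the assumption being doubted here. The disk count and size in the usage note are hypothetical, not figures from this thread.

```python
import math


def p_ure_during_rebuild(surviving_disks: int, disk_tb: float,
                         ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading every surviving disk in full.

    Assumes independent bit errors at the datasheet rate (an assumption,
    not a measured value).
    """
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round away.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Hypothetical 4-disk RAID5 of 4 TB drives: 3 survivors are read in full.
    print(f"{p_ure_during_rebuild(3, 4.0):.3f}")
```

With those hypothetical numbers the model gives a bit over 60% per rebuild, which is where the alarming headline figures come from. Whether real drives actually behave like that model is the open question.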

2 comments

cmurf 3695 days ago

Well at least on Linux, by default almost everyone (using consumer drives) has their array in a very common misconfiguration. And this leads to raid5 collapse much sooner than it should.

The misconfiguration is that the drive's SCT ERC timeout is greater than the kernel's SCSI command timer. So what happens on a URE is, the drive does "deep recovery" if it's a consumer drive, and keeps trying to recover that bad sector well beyond the kernel's default command timer, which is 30 seconds. At 30 seconds the kernel assumes something's wrong and does a link reset. On SATA drives this obliterates the command queue and any other state in the drive. The drive doesn't report a read error, doesn't report which sector had the problem, and so RAID can't do its job and fix the problem by reconstructing the missing data from parity and writing the data back to that bad sector.
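A rough way to check for this mismatch is sketched below. It assumes you already have the drive's ERC value in deciseconds (the unit `smartctl -l scterc` reports) and that the kernel timer is readable from `/sys/block/<dev>/device/timeout`. The function names are made up for illustration, and treating "ERC disabled or unsupported" as misconfigured is an assumption based on the deep-recovery behaviour described above.

```python
from pathlib import Path
from typing import Optional


def kernel_command_timeout_s(dev: str) -> int:
    """Read the kernel's SCSI command timer in seconds for a device like 'sda'."""
    return int(Path(f"/sys/block/{dev}/device/timeout").read_text().strip())


def erc_is_misconfigured(erc_deciseconds: Optional[int],
                         kernel_timeout_s: int = 30) -> bool:
    """True if the drive may still be retrying when the kernel gives up.

    erc_deciseconds: the drive's SCT ERC read timeout in tenths of a second,
    or None/0 when ERC is disabled or unsupported (assumed unbounded here).
    """
    if not erc_deciseconds:
        return True
    return erc_deciseconds / 10 >= kernel_timeout_s


if __name__ == "__main__":
    print(erc_is_misconfigured(70))    # 7.0 s ERC against the 30 s default
    print(erc_is_misconfigured(None))  # ERC unavailable
```

The two usual remedies are to set a short ERC on the drive (for example `smartctl -l scterc,70,70 /dev/sdX`) or, where the drive won't allow that, to raise the kernel timer by writing a larger value to the same sysfs file.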

So it's inevitable these bad sectors pop up here and there, and then if there's a single drive failure, in effect you get one or more full stripes with two or more missing strips, and now those whole stripes are lost just as if it were a 2-disk failure. It is possible to recover from this but it's really tedious and as far as I know there are no user space tools to make such recovery easy.
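The stripe-loss rule described above can be sketched as a toy model: a RAID5 stripe survives only while at most one of its strips is missing. The layout is abstract (stripe index, disk index), and the function name and example numbers are hypothetical.

```python
from __future__ import annotations


def lost_stripes(failed_disk: int,
                 bad_strips: set[tuple[int, int]],
                 n_stripes: int) -> list[int]:
    """Return the stripes RAID5 cannot reconstruct.

    bad_strips holds (stripe, disk) pairs for latent unreadable sectors
    that were never repaired. A stripe is lost once two or more of its
    strips are missing.
    """
    lost = []
    for stripe in range(n_stripes):
        missing = {failed_disk} | {d for (s, d) in bad_strips if s == stripe}
        if len(missing) >= 2:
            lost.append(stripe)
    return lost


if __name__ == "__main__":
    # Disk 0 dies; stripes 2 and 7 have stale bad sectors on other disks,
    # while stripe 5's bad sector was on the dead disk anyway.
    print(lost_stripes(0, {(2, 1), (5, 0), (7, 3)}, n_stripes=10))
```

The point of the model is that each unrepaired bad sector on a surviving disk costs a whole stripe once any single disk fails, even though the array looked healthy beforehand.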

I wouldn't be surprised if lots of Linux-based NASes were configured this way, and the user didn't use the recommended drives because, FU vendor, those drives are expensive, etc.


rincebrain 3695 days ago

Don't forget the part where many consumer drives won't let you play with the SCT ERC settings, and some of them just completely crap out on URE and won't come back.

(My personal favorite was when I discovered a certain model of "consumer" drives we had thousands of in production claimed to not support SCT ERC configuration, but if you patched smartctl to ignore the response to "do you support this", the drives would happily configure and honor it.)


jamesblonde 3694 days ago

Most enterprise-class drives are just consumer drives packaged with a bit more software, but I guess you know that.


rincebrain 3694 days ago

Yeah, I was just entertained by how lazily the removal was implemented in the consumer drive FW.


jamesblonde 3695 days ago

Follow the money: who is selling the "RAID5 is dead" story? The main worry is correlated failures, if you have the same types of drives in your arrays and they all reach end of life together.
