Absolutely true. The RAID controller would randomly lose drives, and its driver would randomly cause kernel panics. We tried different firmware versions and different kernels and made some progress, but never really got it stable under load.
However, that's the risk you run with single points of failure. Put all your data on one big box, and any failure in your RAID hardware, RAID firmware, RAID drivers, network drivers, kernel, RAM, OS, et cetera will take down the big box and thus take down anything relying on it.
The lesson I learned wasn't to build a super-robust single system; it was to have enough redundancy to stay up when something inevitably fails.
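To put rough numbers on that, here's a back-of-the-envelope sketch in Python. The per-component availabilities are invented for illustration (they are not measurements from the box I described), the function names are my own, and the math assumes failures are independent, which real hardware often violates (shared power, the same firmware bug on every box).

```python
from math import prod


def series_availability(component_availabilities):
    """Availability of one box that is up only if every component is up."""
    return prod(component_availabilities)


def redundant_availability(box_availability, replicas):
    """Availability of a group that is up if at least one replica is up.

    Assumes replica failures are independent, which is optimistic.
    """
    return 1 - (1 - box_availability) ** replicas


# Hypothetical availabilities for the parts of a single big box.
components = {
    "raid_hardware": 0.999,
    "raid_firmware": 0.999,
    "raid_driver": 0.995,
    "network_driver": 0.999,
    "kernel": 0.999,
    "ram": 0.9995,
}

single_box = series_availability(components.values())
three_boxes = redundant_availability(single_box, replicas=3)

print(f"one big box:      {single_box:.6f}")
print(f"three such boxes: {three_boxes:.9f}")
```

The point is the shape of the result, not the exact figures: every extra component in the chain can only pull a single box's availability down, while each independent replica multiplies the chance of a total outage by the (small) chance that one more box is down.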