by beachstartup
4137 days ago
we don't exactly run the biggest operation, but in our experience, over thousands of years of cumulative uptime, the most common failure items are network interface cards (or on-motherboard network interfaces) and hdds. RAID controllers fail left and right. we keep tons of spares around. ssd failures are few and far between, cpus basically do not fail, and memory can go bad but it's exceedingly rare and easy to fix. psus fail but are also easy to fix in modern computers (slide-out, redundant, etc.).

having said all that, heat is the primary killer of hardware. if you run a lot of equipment in a dense environment, get a laser thermometer, find your hot spots, and fix them with some industrial fans or by moving your gear around. once your stuff gets hot, anything can fail in weird and mysterious ways.