by Symbiote 2670 days ago
I have 30 machines that run at 100% load for an hour, then at 25% load for an hour, and the pattern repeats. Every quarter they run at 100% for about a week.

Every 6 months or so a hard drive fails (out of 16 per server); no other components have failed. 10 machines are 6 years old (used as the test system, as they're out of warranty), 10 are 3 years old, and 10 are new.
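
As a rough back-of-the-envelope sketch, those figures can be turned into an annualized failure rate (AFR). The wording leaves it open whether one drive fails every 6 months across all 30 machines or per server, so both readings are computed here. The constant and function names are illustrative, and "every 6 months or so" is assumed to mean about 2 failures per year.

```python
# Rough annualized failure rate (AFR) from the figures quoted above.
# Assumption: "every 6 months or so" is treated as about 2 failures per year.
DRIVES_PER_SERVER = 16
SERVERS = 30
FAILURES_PER_YEAR = 2.0


def afr(failures_per_year: float, drives: int) -> float:
    """Fraction of the drive population expected to fail in one year."""
    return failures_per_year / drives


# Reading 1: one failure every 6 months across the whole 30-machine fleet.
fleet_afr = afr(FAILURES_PER_YEAR, DRIVES_PER_SERVER * SERVERS)

# Reading 2: one failure every 6 months within each 16-drive server.
per_server_afr = afr(FAILURES_PER_YEAR, DRIVES_PER_SERVER)

print(f"fleet-wide reading: {fleet_afr:.2%}")      # 0.42%
print(f"per-server reading: {per_server_afr:.2%}")  # 12.50%
```

The two readings differ by a factor of 30, so the interpretation matters a lot when comparing against published drive statistics.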

There are also 20 or so other servers under different loads; I've not had anything fail other than hard drives.

When I started the job, there were spare power supplies for some 10+ year old servers in storage, so I'm either lucky or reliability is improving.