by Puts
284 days ago
My experience after 20 years in the hosting industry is that customers in general have more downtime due to self-inflicted, over-engineered replication or split-brain errors than to actual hardware failures. One server is the simplest and most reliable setup, and if you have backups and automated provisioning you can re-deploy your entire environment in less time than it takes to debug a complex multi-server setup. I'm not saying everybody should do this. There are of course a lot of services that can't afford even a minute of downtime. But there are also a lot of companies that would benefit from a simpler setup.
In all those years, I’ve had precisely one actual hardware failure: a PSU went out. They’re redundant, so nothing happened, and I replaced it.
Servers are remarkably resilient.
EDIT: 100% uptime modulo power failure. I have a rack UPS and a generator, but I once discovered the hard way that the UPS batteries couldn’t hold a charge long enough to keep the rack up while I brought the generator online.