by cullenking
445 days ago
Enterprise server gear is pretty reliable, and you build your infra to be fully redundant. In our setup, no single machine failure will take us offline. I have 13 machines in a rack running a > 10MM ARR business, and haven't had any significant hardware failures. We have had occasional drive failures, but everything is RAID1 at a minimum, so they are a non-issue.

We just replaced our top-of-rack firewall/proxies, which were 11 years old and working just fine. We did it out of power and reliability concerns, not because there was a problem. App servers get upgraded more often, but that's because of density and performance improvements.

What does cause a service blip fairly regularly is having a single upstream ISP. I will have a second ISP into our rack shortly, which means that whole class of short outage will go away. It's really the only weak spot we've observed. That being said, we are in a nice datacenter that is a critical hub in the Pacific Northwest. I'm sure a budget datacenter will have a different class of reliability problems that I am not familiar with.

But again, an occasional 15-minute outage is really not a big deal business-wise. Unless you are running a banking service or something, no one cares when something is down for 15 minutes. Heck, all my banks regularly have "maintenance" outages that are unpredictable. I promise, no one really cares about five nines of reliability for the vast majority of services.
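
To put rough numbers on the five nines comparison, here is a back-of-the-envelope sketch in Python. The function names are made up for illustration, the year is taken as 365 days, and "one 15-minute outage per year" is an assumed frequency, not a figure from our rack.

```python
# Back-of-the-envelope availability arithmetic (illustrative only).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_budget_minutes(nines: int) -> float:
    """Minutes of downtime per year allowed at an availability of N nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


def availability_percent(outage_minutes: float) -> float:
    """Yearly availability (in percent) given total outage minutes per year."""
    return 100 * (1 - outage_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {downtime_budget_minutes(n):.1f} min/year")
    # Assumed frequency: a single 15-minute outage in a year.
    print(f"one 15-minute outage: {availability_percent(15):.4f}% available")
```

Five nines leaves a budget of only about 5.3 minutes a year, so a single 15-minute blip already blows it, while still leaving you comfortably above four nines.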