by gnur
1683 days ago
> Less than 100% reliability is essential

This is actually a take most SREs would (or should) hold.
Every added 9 of reliability increases the price exponentially. Finding the right level of reliability is something most companies should focus on more, because sometimes a single physical machine that might go down once a year for a few hours is perfectly capable of providing all the resources a medium-sized business needs. Proper backups, monitoring, and recovery runbooks can even cut the downtime of such a simple system to minutes, while easily saving you maybe thousands per month.
Slowing your young company down in order to turn 0.9995 to 0.9998 is almost always a terrible trade. Even turning 0.995 to 0.999 is hard to justify in most places. (That improvement saves about 35 hours of downtime per year.)
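
For anyone who wants to check the arithmetic, here is a rough sketch of the conversion from an availability fraction to expected downtime. It assumes a 365-day year (8760 hours) and treats availability as a simple fraction of wall-clock time. The function names are made up for illustration.

```python
# Assumption: a non-leap year, and availability measured against wall-clock time.
HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at the given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def hours_saved(before: float, after: float) -> float:
    """Downtime hours per year recovered by moving from `before` to `after`."""
    return downtime_hours_per_year(before) - downtime_hours_per_year(after)


if __name__ == "__main__":
    for a in (0.995, 0.999, 0.9995, 0.9998):
        print(f"{a}: {downtime_hours_per_year(a):.2f} hours/year")
    print(f"0.995 -> 0.999 saves {hours_saved(0.995, 0.999):.2f} hours/year")
    print(f"0.9995 -> 0.9998 saves {hours_saved(0.9995, 0.9998):.2f} hours/year")
```

Under those assumptions, going from 0.995 to 0.999 recovers about 35 hours per year, while going from 0.9995 to 0.9998 recovers only about 2.6 hours.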