by darkwater
1610 days ago
> Yes. This was a multi-day outage and eventually the oncall does need sleep, so you need more of the team to help with it. Well, also your business is 100% down, all the capable engineering eyes should be looking at the issue.

And at or beyond that point, you pretty much have to take inspiration from firefighters and emergency services: you need to organize the subsystem experts to rest and sleep in shifts, ideally while the simpler but time-consuming tasks are running. Otherwise these people will crash, and you lose their skills and knowledge for the rest of that outage. That can make an outage almost impossible to handle.