Hacker News
by Spooky23 1716 days ago
Sure. As long as you plan for disaster.

The place where I worked had failure trees for every critical app and service. The goal for incident management was to triage and make an initial escalation to the right group within 15 minutes. When I left, they were hitting that target about 96% of the time overall and 100% for infrastructure.