by turtlebits
1475 days ago
Watching the demo, how is this more than a glorified Slack workflow to a generic ticketing system? The only additions I can see are the lifecycle events that update your ticket and the postmortem reminder. This could easily be done by an engineer/tech following a runbook/doc that's linked in the alert. Cutting tickets, creating a Slack channel, escalating, and/or setting up a bridge can be done in minutes (and most alerting systems can automate that). IME, defining and enforcing the process is more important than the automation.
|
You could certainly go through a checklist/runbook that's attached to the incident. That's the setup most companies we speak with have. However, compliance, consistency, and speed in following it during a stressful incident tend to be quite poor. I agree that tasks like creating Slack channels and Zoom bridges aren't difficult, but they aren't what expensive engineers should be focused on. We want them to focus on putting out the fire, not on the admin.
A few examples of things a simple doc can't do:
- automatically track incident metrics
- set recurring reminders to e.g. update the statuspage
- auto-archive channels after periods of inactivity
- create incident timeline without copy-pasting
- update Jira/Asana/Linear tickets with incident metadata and action items
- automatically invite responders to the incident channel
None of these are impossible to do manually; it's just a question of how much time you'd like people spending on them. If you only have a few incidents a year, I agree this is likely overkill.
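
To make a couple of the items in the list above concrete, here is a minimal sketch of how incident metrics and a timeline could fall out of lifecycle events without any copy-pasting. It is purely illustrative and not how any particular product works: the `Incident` class, its methods, the status names ("opened", "acknowledged", "resolved"), and the `reminder_due` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Incident:
    """Hypothetical incident record; names are illustrative assumptions."""
    title: str
    # Each entry is (timestamp, status), appended as lifecycle events arrive.
    timeline: List[Tuple[datetime, str]] = field(default_factory=list)

    def record(self, status: str, at: datetime) -> None:
        """Append a lifecycle event to the timeline."""
        self.timeline.append((at, status))

    def first(self, status: str) -> Optional[datetime]:
        """Return the earliest timestamp recorded for a status, if any."""
        times = [at for at, s in self.timeline if s == status]
        return min(times) if times else None

    def duration(self, start: str, end: str) -> Optional[timedelta]:
        """Time between the first `start` event and the first `end` event."""
        t0, t1 = self.first(start), self.first(end)
        if t0 is None or t1 is None:
            return None
        return t1 - t0

    def render_timeline(self) -> List[str]:
        """Produce a chronological, human-readable timeline."""
        return [f"{at:%H:%M} {status}" for at, status in sorted(self.timeline)]


def reminder_due(last_update: datetime, now: datetime,
                 interval: timedelta) -> bool:
    """True when a recurring reminder (e.g. 'update the status page') is due."""
    return now - last_update >= interval


if __name__ == "__main__":
    inc = Incident("checkout errors")
    inc.record("opened", datetime(2024, 1, 1, 10, 0))
    inc.record("acknowledged", datetime(2024, 1, 1, 10, 4))
    inc.record("resolved", datetime(2024, 1, 1, 10, 45))
    print("time to acknowledge:", inc.duration("opened", "acknowledged"))
    print("time to resolve:", inc.duration("opened", "resolved"))
    print("\n".join(inc.render_timeline()))
```

The point is only that once lifecycle events are captured as data, the metrics, the timeline, and the reminders are derived from them rather than reconstructed by hand after the fact.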