by adbachman
1469 days ago
We just picked them up across an engineering/tech org of ~350 to do precisely what they describe here. PagerDuty for notifications and on-call rotations, Datadog for monitoring, Slack for in-the-moment communication, Google Docs for post-mortem documentation; Blameless as the glue and automation that takes away a lot of the incidental mental overhead of communicating and documenting while the incident is happening. Super encouraging to see competition, though. A former teammate turned me on to https://how.complexsystems.fail/ and I'm willing to believe that in a complex enough system, the closest we will get to understanding how it actually works is during/after incident response.
And that is great. We know plenty of happy Blameless customers; they're certainly one of the better ones in the market we compete in.