Hacker News
by rozenmd 1099 days ago
It sounds like your team lacks a culture of continuous improvement. IMO, in a product team, the on-call engineer's full-time job is to make the next on-call engineer's job easier by deleting irrelevant alerts, automating fixes, and generally making the system more stable.

I wrote a longer guide about this here: https://onlineornot.com/incident-management/on-call/improvin...

1 comments

Yeah, I agree it is a cultural issue to some extent. But honestly, on-call at my current company is quite demanding. During the on-call week, engineers try to improve things, but they always run out of time or miss a few things, which then puts a burden on the next on-call.

I think there should be a nice lightweight tool that gives a clear summary and a tracking mechanism to make these tasks quicker, even if it only tags the runbooks that have not been updated. All those notes get lost in documentation and are never referred back to.

In previous teams, we just used a JIRA backlog to manage these tasks.
Yeah, JIRA could be handy, though you need to create a ticket for every task and rigorously track them alongside the other backlog and story items.