I don't think you'll find a single framework that addresses everything you're looking for in your last paragraph. That being said, some advice:

> Clearly define on-call priorities

Sit down with your team and, if necessary, one or two stakeholders. Create a document and start listing priorities and SLAs during a meeting. The goal isn't actually the doc itself; going through the exercise and soliciting feedback gets people to raise areas where they disagree and point out things you haven't thought of. The ordering depends on what matters to your team, but most people will tie things to revenue in some way. You can't work on everything, and the groups that complain most loudly aren't necessarily the ones who deserve the most support.

> balancing immediate production needs with Opex improvements

Well, first, are your 'immediate production needs' really immediate? If your entire product is unusable, that might be the case, but certain issues, while qualifying as production support, don't need to be handled immediately and can be deferred until enough of them have accumulated to be worked on together. Otherwise, you can start by committing to certain roadmap items and then do as much production support as you have time for, or vice versa. A lot of this depends on the stage of your company; more mature companies will naturally prioritize support over a sprint to viability.

> Manage long-term fixes related to past on-call issues without overwhelming current on-call engineers. Create a structured approach that ensures ongoing focus on improving operational experience over time.

Whenever a support task or on-call issue is completed, keep track of it by assigning labels or simply listing it in some tracking software. To start off, you might have really broad categories like "customer-facing" and "internal-facing". If you find that you're spending 90% of your support time on a particular service or process, that's a good sign that investment in that area could be valuable. Over time, especially as you get a better handle on support, you should make the categories more granular so you can focus more specifically, but not so granular that only one issue per month falls into each of them.
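
If your tracking software can export completed tasks with a label and the time spent, a few lines of scripting are enough to see where the support time goes. Here is a rough sketch of that tally. The ticket data, the function names, and the 50% threshold are all made up for illustration; plug in whatever your tracker exports and whatever cutoff makes sense for your team.

```python
from collections import defaultdict

# Hypothetical export: (label, hours spent) for each completed support task.
tickets = [
    ("customer-facing", 6.0),
    ("customer-facing", 3.0),
    ("internal-facing", 1.0),
]


def time_share_by_label(tickets):
    """Return each label's fraction of the total support hours."""
    totals = defaultdict(float)
    for label, hours in tickets:
        totals[label] += hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: hours / grand_total for label, hours in totals.items()}


def labels_worth_investing_in(tickets, threshold=0.5):
    """List labels whose share of support time meets the (assumed) threshold."""
    shares = time_share_by_label(tickets)
    return sorted(label for label, share in shares.items() if share >= threshold)


if __name__ == "__main__":
    for label, share in sorted(time_share_by_label(tickets).items()):
        print(f"{label}: {share:.0%}")
    print("Candidates for investment:", labels_worth_investing_in(tickets))
```

Once you split the broad labels into finer ones, the same tally works unchanged. If a label only ever collects a ticket or two, that's a hint the categories have gotten too granular.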