by pokoleo
451 days ago
From my experience working on SaaS and improving ops at large organizations, I've seen that "on-call culture" tends to be inversely proportional to incentive alignment. When engineers bear the full financial consequences of 3 AM pages, they're more likely to make systems resilient by adding graceful failure modes. When incident response becomes an organizational checkbox divorced from financial outcomes and planning, you get perpetual firefighting. The most successful teams I've seen treat on-call as a leading indicator: every incident represents unpriced technical debt that should be systematically eliminated. Each alert becomes an investment opportunity rather than a burden to be rotated. Big companies aren't missing the resources to fix this; they just don't have the aligned incentive structures that make fixing it rational for the individuals involved. The most rational thing to do as an individual on a bad rotation: quit or transfer.