by efxhoy 697 days ago
> Running on-call well is a culture problem. You need management to prioritize observability (you can't fix what you can't show as being broken), then you need management to build a no-broken-windows culture (feature development stops if anything is broken).

I was lucky enough to join a company where management does this. The managers were made to do this by experienced engineers who explained to them in no uncertain terms that stuff was broken and nothing was being shipped until things stopped being broken. Unless you have good managers this won’t happen without a fight and it’s a fight I think we as engineers need to take.

Some managers in other teams played the “oh, it’s not super high impact, it’s not prioritized” game. Those teams now own a bunch of broken stuff and make very slow progress because their developers are tiptoeing around broken glass, and they end up building even more broken stuff because nothing they own is robust. Those managers played themselves.

Communication with management is bidirectional, sometimes they need a lot of persuasion.

2 comments

> Communication with management is bidirectional, sometimes they need a lot of persuasion.

Sounds like managing up, i.e. doing IC workload and the manager's job. Hard pass.

If you'd rather be miserable at work than content at work, that's a choice.

I tried that approach with a colleague and it just got more and more heated and frustrating. At the same time we were getting heat for reliability. I ended up quitting. Since then I've heard from a colleague that they made some staff redundant, on a team that was already underwater.

I doubt very much that my experience was unique. In my new position we have the same problems with reliability, but I don’t get involved in the political side of trying to argue about it; I just turn up and do my 9-5. I’m a lot less stressed now!