Hacker News
by photonthug 890 days ago
Despite years of friendly-sounding devops philosophy, there are times when devs and ops are fundamentally going to be in conflict. It's sort of a proxy war between devs who understandably dislike red tape and management who loves it, with devops caught in the middle and on the hook for both rapid delivery of infrastructure and some semblance of governance.

An org with actual governance in place really can't deliver infra rapidly, regardless of whether the underlying stuff is cloud or on prem, because whatever form governance takes in practice, it tends to be distributed, i.e. everyone wants to be consulted on everything but they also want their own responsibility/accountability to be diluted. Bureaucracy 101.

Devs only see ops taking too long to deliver, but ops is generally frozen waiting on infosec, management approving new costs, data stewards approving new copies across ends, architects who haven't yet considered/approved whatever outlandish new toys the junior devs have requested, etc.

It depends on exactly what you're building, but with a competent ops team, cloud vs on prem shouldn't change that much. Setting aside the org-level externalities mentioned above, developer preference for stuff like certain AWS APIs or complex services is the next major issue for declouding. From the ops perspective, cloud vs on prem is largely gonna be the same toolkit anyway (helm, terraform, ansible, whatever).

2 comments

Whilst often true in practice, this doesn't have to be true.

The reality is, a lot of these orgs have likely already discovered devops, pipelines, deployment strategies, observability, and compliance as code.

There's basically little in compliance that can't be automated with patterns and platforms, but in most of these organizations a delivery team's interface with the org is their non-technical delivery manager, who folds like a beach chair when they're told no by the random infosec bod who's afraid of automation.

I've cracked this nut a few times though. It requires you to be stubborn, talk back, and have the gravitas and understanding to be taken seriously. I.e. yelling "that's dumb" doesn't work, but asking them for a list of what they'd check, and presenting an automated solution to their group, where they can't just yell no, might.
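To make "a list of what they'd check" concrete, here is a minimal sketch of what that list could look like once it's written as code. Everything in it is hypothetical: the `Check` type, the three rules, and the shape of the `request` dict are made up for illustration and don't correspond to any real infosec policy or tool.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Check:
    """One item from the reviewer's checklist, expressed as a predicate."""
    name: str
    passes: Callable[[dict], bool]


# Hypothetical checklist; a real one would come from the infosec team.
CHECKS = [
    Check("storage is encrypted at rest", lambda r: r.get("encrypted") is True),
    Check("no public network exposure", lambda r: not r.get("public", False)),
    Check("owner tag is present", lambda r: bool(r.get("tags", {}).get("owner"))),
]


def failed_checks(resource: dict, checks: List[Check] = CHECKS) -> List[str]:
    """Return the names of every check the resource request fails."""
    return [c.name for c in checks if not c.passes(resource)]


# Hypothetical infra request, e.g. parsed from a ticket or a plan file.
request = {"encrypted": True, "public": True, "tags": {"owner": "team-widgets"}}
print(failed_checks(request))  # ['no public network exposure']
```

The point of presenting it this way is that each objection becomes one more entry in the list rather than a blanket no, and a request that passes every entry no longer needs a human in the loop.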

Yes, of course management is often the problem.

I think it helps when people actually take a step back and understand where the money that pays their salary comes from. Often times people are so ensconced in their tech bureaucracy they think they are the tail that wags the dog. Sometimes the people that are the most hops from the money are the least aware of this dynamic. Bureaucracies create an internal logic of their own.

If I am writing some internal software for a firm that makes money selling widgets, and I decide that what we really need is a 3 year rewrite of my app for reasons, I am probably not helping in the sale or the production of widgets. If another team is provisioning hardware for me to write the software on, and it now takes 2 weeks to provision virtual hardware that could take seconds, then they are also not helping in the sale or the production of widgets.

These are the kind of orgs that someone may one day walk into, blast 30% of the staff, and find no impact on widget production, and an obvious 30% savings on widget costs...

> If another team is provisioning hardware for me to write the software on, and it now takes 2 weeks to provision virtual hardware that could take seconds, then they are also not helping in the sale or the production of widgets.

Well, in this example, the ops team slowing down pointless dev work, by not quickly delivering the platform that work is going to happen on, is effectively engaged in cost savings for the org. The org is not paying for the platform, which helps them because the project might be canceled anyway, plus the slow movement of the org may give them time to organize and declare their real priorities. Also, due to the slowdown, the dev and ops teams are potentially more available to fix bugs or whatnot in actual widget production. It's easy to think that "big ships take a while to turn" is some kind of major bug or at least an inefficiency, but there are also reasons orgs evolve in that direction and times when it's adaptive.

> Often times people are so ensconced in their tech bureaucracy they think they are the tail that wags the dog.

Part of my point is that, in general, departments develop internal momentum and resist all interface/integration with other departments until or unless that situation is forced. Structurally, at a lot of orgs of a certain size, that integration point is ops/devops/cloud/platform teams (whatever you call them). Most people probably can't imagine being held responsible for lateness on work that they are also powerless to approve, but for these kinds of teams the situation is almost routine. In that sense, simply because they are an integration point, it's almost their job to absorb blame for/from all other departments. If you're lucky, management that has a clue can see this happening, introduce better processes, and clarify responsibilities.

Summarizing all that complexity and trying to reduce it to some specific technical decision like cloud vs on-prem is usually missing the point. Slow infra delivery could be down to technical incompetence or technology choices, but in my experience it's much more likely a problem with governance / general org maturity, so the right fix needs to come from leadership with some strong, stable vision of how interdepartmental cooperation & collaboration is supposed to happen.