Hacker News
by clouded 1461 days ago
Of course, but that's not how I've seen it play out now, twice. Both were eerily similar in going all in on full cloud mode: Kubernetes, Docker, YAML, HashiCorp, trendy product soup. Complete with architecture teams who try to wrangle it into a best-practice starter repo for all engineering teams to use. I have no say in it. Otherwise, yeah, you can deploy on a bare EC2 instance, but that's too simple for a serious enterprise software developer.

Having a starter repo is a good sign. What should be happening is that such architecture teams (or platform teams as they're usually known now) hide all the complexity of how and where things run, beyond a simple set of guarantees handed to developers.

As much as I love setting up single instances and deploying directly on them, it does not scale, and the quality of the deployment depends heavily on the developer. So do the quality and even the existence of its documentation.

Architecture teams, platform teams, or core teams, as they are usually known, have incentives that are misaligned with those of normal teams and, more importantly, with those of the business.

And the reason is indirection. They are two layers away from the business, shielded first by product people, then by product developers. As a consequence, they live in a bubble where nothing matters except a good starter repo or a new standard to push onto all the recalcitrant idiots that work on _products_.

I know that this notion of my job exists, but I don't agree with it at all. Making it easier to use your platform than building things from scratch is really the only way to sustainably build a platform people don't work around all the time. Being able to bake policy into that platform is just a nice bonus.

You are right about one thing, though: I've seen enough undocumented deployment crap, held together by hot glue and prone to cause the next incident whenever someone hits the wrong button, that my relationship with developers who dabble in ops is at least somewhat adversarial.

Note that this is different from developers who have a need they don't immediately know how to solve (and that is not covered by a platform team) and either ask an architect (those are different from platform for us, by the way) or invest the time themselves. That still often leads to cognitive overload or eternal temporary fixes, but it's still better than "why do you care, it works, I only had to punch 10 holes into the firewall and commit one service account key to git" type developers.

My personal experience with core teams has often been a disaster. The only successful core team I saw adopted the servant leadership approach, letting product teams make choices and staying pretty much out of their way while codifying existing practices and moderating discussions. It also served as the interface to external infrastructure and engineering.

This core team was broken up after a few years, its head fired and replaced by an oppressive junkie that started spewing corporate standards and imposing frameworks with the speed of an office printer.

You really have to have a special mindset to be willing to join a core team, and this mindset is the opposite of what people who deliver a working product have.

There is a reason why "premature optimization is the root of all evil" is the motto of many generations of programmers. From their perspective, core teams are personified evil, because their only purpose is to optimize, prematurely.

My first experience on such a team was a team that started out improving a lacking standardized deployment process (including changing the target from a very bad, slow-scaling, error-prone AWS product (yes, Beanstalk) to k8s). The biggest benefit here was that the team was on the same floor as most developers, and the entire company was really small. We knew the challenges that developers faced.

We then eventually rewrote the platform from an "everything is implicit" approach (branch leads to deployment with stable URL x, logs end up at Y, metrics get scraped at Z) to "everything is explicit but we have a second component that emulates the old way unless overruled". Nothing changed for developers, except that they got a lot more levers.
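
A rough sketch of that split, in Python purely for illustration. This is not the actual platform code: the names `DeploymentSpec`, `legacy_defaults` and `resolve`, the `example.internal` URL pattern, the log path and the `/metrics` default are all made up. The idea is only that one component reproduces the old conventions, and anything a developer sets explicitly wins over them.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class DeploymentSpec:
    """Explicit description of one deployment (hypothetical fields)."""
    url: Optional[str] = None
    log_sink: Optional[str] = None
    metrics_path: Optional[str] = None


def legacy_defaults(service: str, branch: str) -> DeploymentSpec:
    """Emulate the old implicit conventions (these patterns are invented)."""
    return DeploymentSpec(
        url=f"https://{branch}.{service}.example.internal",
        log_sink=f"logs/{service}/{branch}",
        metrics_path="/metrics",
    )


def resolve(service: str, branch: str, explicit: DeploymentSpec) -> DeploymentSpec:
    """Fill every field the developer left unset with the legacy default."""
    defaults = legacy_defaults(service, branch)
    overrides = {
        f.name: getattr(explicit, f.name)
        for f in fields(explicit)
        if getattr(explicit, f.name) is not None
    }
    return replace(defaults, **overrides)


# Setting nothing keeps the old behavior; one override changes exactly one field.
unchanged = resolve("shop", "main", DeploymentSpec())
custom = resolve("shop", "main", DeploymentSpec(metrics_path="/internal/metrics"))
```

Keeping the emulation in a separate function is what lets "nothing changed for developers" hold: the explicit spec can grow new levers without touching the defaults anyone already relies on.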

Then that company was merged back into the mothership (it was a "moonshot startup make an online shop for us" kind of company) and everyone there that could be enthusiastic about technology was excited about the stack, considering they were stuck in the 2000s before. The tech turned out to be flexible enough to accommodate the needs of modern Scala/Node services and legacy PHP alike (with the help of a base image that included a little Go proxy to add standard HTTP metrics).

Unfortunately there was a change in leadership to someone who wanted to essentially recreate a tech stack they had used elsewhere. Akamai to Cloudflare, AWS to GCP, Slack to Teams... unilaterally. The team imploded within a year and a large Kubernetes vendor came in to get developers on "standard tooling".

As far as I can tell, 2 years later, our infrastructure survives though. We built it pretty tough and I guess none of the other "standard tooling" solutions really fit. The vendor even ended up asking if they could open source the system. Unfortunately nobody in legal cared enough to figure that one out. I'd have liked it to survive somewhere.

It's a unique story of a platform that actually evolved mostly organically, and I realize that most attempts at platforms don't work that way. I always try to carry the lessons from this into new attempts, and it has been working pretty well. Getting someone from platform to work in development teams is probably the most useful thing I'd recommend.