by lijok 1036 days ago
This has to be the worst take on IAC organization I have ever seen. I would never have thought someone would try to apply the OSI model to infra code management.

How long does it take to deploy a new service with this approach? A week?

6 comments

A few reminders from the HN commenting guidelines:

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

https://news.ycombinator.com/newsguidelines.html

Normally I would absolutely agree, but this article is the equivalent of one called "Flying a Boeing 737" in which the author advises you to invert the plane, fly at sea level and turn off all instruments. That is how little the proposed approach in this article makes sense.

Given that, how is one supposed to reply critically to such a post? I'm genuinely curious and open to suggestions, as it's something I'm clearly not good at.

> Given that, how is one supposed to reply critically to such a post? I'm genuinely curious and open to suggestions, as it's something I'm clearly not good at.

Link to or describe a better approach and explain specifically why it's better.

Your only specific point was that it's slow to deploy a new service, which for most organizations is somewhere near the bottom of their priority list. In fact, many organizations probably prefer slow deployments, as that implicitly discourages unnecessary services and infrastructure bloat. That in turn lowers the long-term maintenance burden and technical debt. Five hundred services that are in reality owned by no team, or that no one knows about, are not what you want in an organization.

You reply critically to such a post by using experience, references and rhetoric to explain your position. You know this.

These Reddit-level comments don't belong on this site.

Do you know what tone policing is? HN is one of the most conservative echo chambers I visit (yes, I should stop) because everyone is constantly being tut-tutted.
I disagree. In a lot of cases a new service would require an update to only one or two stacks, and only those stacks need deploying.

In some cases some version of this is required, because the vendor API does not give proper feedback when an operation is complete, so the deployment needs time to settle.

Or there are Docker image builds that run on every deployment and take a long time and resources unnecessarily (I believe Pulumi has a fix for some version of this).

If your stacks are deployed via CI/CD, it’s not really a big deal to deploy 10x stacks in sequence, or just.

This may be overkill for a lot of projects but it’s valuable insight from a respected organization / individual.

So what is a good take?
Whoa, you expect someone to drop an inflammatory opinion AND offer a reasonable alternative?

What's next, real world data to back up their claims? Research papers offering corroborating evidence?

This is the internet, we don't do that here.

I'll also accept a peer reviewed HN comment chain in lieu of academic journals.
Haha, you're on point
I'd say don't do any layering and stick with the standard naming convention of "stacks". Start with a common stack with all your common stuff, and application stacks with stuff that is specific to some application, say all the resources for a microservice or everything for a BI system, etc.

Avoid splitting this up as it introduces too much complexity. The IAC code should be very simple such that any dev can pick it up just coming off the tutorials.
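To make that layout concrete, here is a minimal sketch in plain Python. It is not Pulumi or Terraform code; the `Stack` class, the stack names, and the resource names are all made up for illustration. It only models the idea of one common stack plus per-application stacks that depend on nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class Stack:
    """A named group of resources that is deployed together (illustrative model only)."""
    name: str
    resources: list[str]
    depends_on: list[str] = field(default_factory=list)


# One shared stack for the common pieces (resource names are made up).
common = Stack("common", ["vpc", "dns-zone", "log-bucket"])

# One stack per application, holding everything that application needs.
orders = Stack("orders-service", ["queue", "database", "container-service"], depends_on=["common"])
bi = Stack("bi-system", ["warehouse", "etl-job", "dashboards"], depends_on=["common"])


def layering_violations(stacks: list[Stack]) -> list[str]:
    """Return the names of stacks that depend on anything other than 'common'."""
    return [s.name for s in stacks if any(dep != "common" for dep in s.depends_on)]
```

A check like `layering_violations` is one possible way to keep the layout flat: an application stack that starts depending on another application stack would show up in the returned list.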

Company I'm in has 3 layers and dozens of stacks and it's made the whole thing impossible to reason about. No one wants to touch it anymore which means we now have a Platform team that screws around with this chap for months on end.

Note: Lee Briggs works for Pulumi as a Principal Platform engineer so it's in their interest to make this too complicated.

> I'd say don't do any layering

> Start with a common stack with all your common stuff

> application stacks with stuff that is specific to some application

So... layers! Right then.

> Note: Lee Briggs works for Pulumi as a Principal Platform engineer so it's in their interest to make this too complicated.

ding ding: we have a winner

Global namespace shared resources, regional namespace shared resources, then each app provisions its own bit, consuming/linking the two aforementioned layers.

Everyone gets here eventually, and you can just fight over stuff like "is an ALB shared regional or app specific".
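A rough sketch of that global / regional / app arrangement, using only Python's standard-library `graphlib`. The stack names are hypothetical, and this only computes a deployment order; it does not call any IaC tool.

```python
from graphlib import TopologicalSorter

# Each stack maps to the stacks it consumes. All names are hypothetical.
DEPENDENCIES = {
    "global-shared": set(),
    "regional-shared": {"global-shared"},
    "app-orders": {"global-shared", "regional-shared"},
    "app-billing": {"global-shared", "regional-shared"},
}


def deploy_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every stack comes after the stacks it consumes."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for stack in deploy_order(DEPENDENCIES):
        print("deploy", stack)
```

The shared layers always come out first and the app stacks last; the relative order of the app stacks is not fixed, since they do not depend on each other.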

Put stateful resources in one bucket, non-stateful in another, and then do that until it causes obvious issues then revisit later.

Infra should be as simple as possible, and the simple infra should inform simple app design.

In what mature organization does it take any less than a week to deploy a new service?
In what mature organization does it take more than a week? My current org is quite mature: you probably use it. It's also not a cloud provider, and our AWS bill is in the high 8 figures a month. And yet launching a new service that is not directly pingable from the internet, deployed in, say, 5 regions, is a matter of 3 PRs, adding the service to CI included. I've gone from having no repo at all to deployment in 4 days, because we were in a big hurry. All the infra-defining PRs will get eyes from an SRE or three, but the team that is writing the service is writing the PRs.

I bet we have far more instances under our name than the people who wrote this article, and yet we have nowhere near that level of complexity in our IaC definitions. And yet, somehow we manage. I guess we are immature?

Takes 30 minutes in ours, tops. Provision a new AWS account, ~10 min; copy/paste the generic service template, plug in the vars and deploy, ~20 min.

We did, however, have the advantage of building our IAC from the ground up, and we had the time to do it properly.

There’s a huge difference between “identical clone of an existing service” and “a new service”.

My challenge at $dayjob is that it takes months to spin up a new cloud service because they’re new.

Either a new app that wasn’t on the cloud before — in which case the templates need extensive customisation.

Or, new app in the sense that the devs just cracked open Visual Studio and have no idea yet what they actually need from the cloud.

I get maybe 10-20 copies of a template (dev/tst/prd + ha/dr), and then I have to start from the beginning.

Guidance on how to maximise reusability would actually be very useful.

Unfortunately, in the real world, this seems difficult. Many small variations in requirements tend to make abstractions leaky.

For example, one vendor requires active-passive load balancing for licensing reasons. Millions of dollars' worth of licensing reasons. Neither AWS nor Azure supports anything but active-active in any of their load balancers. (They do in DNS, but for various reasons that won’t work for us.)

Another “new” app (industrial air quality monitoring) is actually from the stone ages and doesn’t support PaaS databases or even 3 of the 4 clustering modes available in IaaS. So a custom load balancer solution is required… just for it.

This is the issue. Everyone that loves the cloud and says it’s simple has easy mode turned on: cookie cutter clones they can stamp out for many identical customers or whatever.

Some people play the game with “big government” difficulty.

I could have elaborated. When I say "generic service template", I mean a service with any cloud requirements (that we've had at any point before) can be assembled from building blocks (TF modules) in ~20 min. This of course doesn't work if every new service has new unique requirements.
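As a loose illustration of "assembled from building blocks", here is a minimal sketch in plain Python rather than actual TF modules. The block functions, the `CATALOGUE` name, and the shape of the returned definition are all assumptions made for the example, not the setup described above.

```python
from typing import Callable

# A building block takes a service name and returns a fragment of its definition.
Block = Callable[[str], dict]


def queue(service: str) -> dict:
    """Hypothetical block: a queue named after the service."""
    return {"queue": {"name": f"{service}-queue"}}


def database(service: str) -> dict:
    """Hypothetical block: a database named after the service."""
    return {"database": {"name": f"{service}-db"}}


# The catalogue holds only requirements that have been seen before.
CATALOGUE: dict[str, Block] = {"queue": queue, "database": database}


def assemble(service: str, wanted: list[str]) -> dict:
    """Build a service definition from known blocks; fail on anything new."""
    missing = [name for name in wanted if name not in CATALOGUE]
    if missing:
        raise ValueError(f"no building block for: {', '.join(missing)}")
    definition: dict = {}
    for name in wanted:
        definition.update(CATALOGUE[name](service))
    return definition
```

The `ValueError` branch is the caveat from the comment: a service that needs something outside the catalogue cannot be stamped out this way, and someone has to write a new block first.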

Happy to talk through the setup we're using and some other setups I've seen work if you're interested, but it's quite extensive.

Our current setup is heavily inspired by the work Gruntwork (not affiliated) are doing. I highly recommend taking a look at how they do it and even subscribing to their service if the need is there. They provide pre-built modules for basically any use case.

References to anything public would be nice

So far I’ve struggled to get any real modularity.

The biggest issue I’ve had is that every assumption I’ve made has been violated.

So either modules must be fully generic — saving no work at all — or make assumptions and then not be reusable.

It’s the same amount of code smh.
thanks for the feedback!