Hacker News
by throwawaaarrgh 849 days ago
The one thing I want, that doesn't exist, and won't for at least 10 years: immutable infrastructure.

Oh, the concept exists. I can make some infrastructure mostly-immutable, myself. But the cloud doesn't give it to me out of the box. What the cloud gives me are APIs. If I write software to call those APIs, predict what the allowed values are, predict the failures I might see, write about 5,000 lines of code to handle the failures, attempt to reconcile differences, retry, store my artifacts, reference them, after implementing a build system, etc., I can get one or two things to be immutable. But for the vast majority of services it's actually impossible.

Take an S3 bucket. Can you make an S3 bucket immutable? The objects inside it might be versioned, sure. Can you roll back all the objects in the bucket to Version 123? Can you roll back the S3 policy to revision 22? Can you make it also roll back the CORS rules? Can you diff all these changes and see a log of them? Can you tell the bucket to fix itself back to the correct expected version of itself? Can you tell it to instead adopt 3 new changes, as part of a version of the S3 bucket you tested somewhere else? The answer is "no".
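To make the wish concrete, here is a minimal sketch of what "the whole bucket configuration as one versioned, immutable thing" could look like as a data structure. This is not an S3 API. `ConfigHistory` and the keys `policy` and `cors` are hypothetical names invented for illustration, and the values are plain strings standing in for real policy documents.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, List, Mapping, Optional, Tuple


@dataclass
class ConfigHistory:
    """Hypothetical append-only log of whole-resource revisions."""
    revisions: List[Mapping[str, str]] = field(default_factory=list)

    def commit(self, config: Dict[str, str]) -> int:
        """Store a read-only copy of config and return its revision number."""
        self.revisions.append(MappingProxyType(dict(config)))
        return len(self.revisions) - 1

    def current(self) -> Mapping[str, str]:
        """Return the latest revision (read-only)."""
        return self.revisions[-1]

    def rollback(self, revision: int) -> int:
        """Roll back by re-committing an old revision, so the log stays intact."""
        return self.commit(dict(self.revisions[revision]))

    def diff(self, a: int, b: int) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
        """Map each key that differs to its (value at a, value at b) pair."""
        old, new = self.revisions[a], self.revisions[b]
        return {
            key: (old.get(key), new.get(key))
            for key in sorted(old.keys() | new.keys())
            if old.get(key) != new.get(key)
        }
```

In this sketch policy and CORS rules live in the same revision, so rolling back one rolls back the other, and every change (including the rollback itself) shows up in the log. Nothing here reconciles a live resource; it only shows the shape of the interface being asked for.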

You can fake it, with a configuration management tool like Terraform. But that's as immutable as a file on your filesystem. Any program can overwrite your files at any time; you have to have Puppet configured to monitor your files, and constantly fix the files when they get changed, track the Puppet code in Git, keep your own log of changes, etc. That filesystem isn't immutable, it's mutable! If it was immutable you wouldn't have to use Puppet (or Terraform). And the sad thing is we're all stuck on Terraform, which is actually terrible for a configuration management tool, because it mostly refuses to reconcile inconsistencies (the way every other configuration management tool in history has). It just bombs out and says "Oh shit, that wasn't a change I planned, and you didn't write this HCL code to handle this weird condition, so I'm just gonna bail and not fix this. Good luck getting production working again." Puppet wouldn't stop working if something other than Puppet updated a file. But nobody seems to mind that we literally regressed in functionality, because a company made up new marketing terms for their tools.
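For anyone who hasn't used a reconciling tool, the behaviour described above boils down to something like the following. It is a toy sketch, not how Puppet or Terraform are actually implemented: `State`, `plan` and `reconcile` are hypothetical names, and the state is just a dict of setting names to values.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]  # hypothetical: setting name -> value


def plan(desired: State, actual: State) -> List[Tuple[str, str, str]]:
    """Return the (action, key, value) steps that turn actual into desired."""
    steps: List[Tuple[str, str, str]] = []
    for key, value in sorted(desired.items()):
        if key not in actual:
            steps.append(("create", key, value))
        elif actual[key] != value:
            steps.append(("update", key, value))
    # Anything present but not declared counts as drift and gets removed.
    for key in sorted(actual.keys() - desired.keys()):
        steps.append(("delete", key, actual[key]))
    return steps


def reconcile(desired: State, actual: State) -> State:
    """Apply the plan to a copy of actual, no matter who caused the drift."""
    result = dict(actual)
    for action, key, value in plan(desired, actual):
        if action == "delete":
            del result[key]
        else:
            result[key] = value
    return result
```

The point is that the loop never asks who changed `actual` or whether the change was planned. It only compares against the declared state and converges, which is the behaviour the comment says is missing.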

Sadly this desired built-in immutability, and the declarative nature of it, won't be built into S3 or other tools for at least a decade or two. They would need to effectively build something akin to K8s just to manage their own components immutably and expose an entirely new API. So we are doomed to do Configuration Management in the cloud, until the cloud starts implementing immutability out of the box.

1 comments

Yeah, sadly true. While I am not a platform engineer I've witnessed their plight many times and I truly sympathize.

Now more than ever because I started making an effort to self-host much more than before... the number of scripts I have to write just to achieve idempotency, never mind immutability, is staggering, and I am already questioning my approach. Will likely start making use of ZFS or Btrfs snapshots, or I don't know, I'll just start manually snapshotting the entire filesystem on my Linux machines (like store all dir/file paths with their sizes and modification dates; it's a start and you can diff against such "snapshots").
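The manual approach in the parenthetical can be sketched in a few lines of Python using only the standard library. The function names `take_snapshot` and `diff_snapshots` are made up for this example, and it records only size and modification time, as described above.

```python
import os
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, int]]  # relative path -> (size in bytes, mtime in ns)


def take_snapshot(root: str) -> Snapshot:
    """Walk root and record size and mtime for every non-directory entry."""
    snap: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: describe symlinks, don't follow them
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            snap[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return snap


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """Report which paths were added, removed, or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Take one snapshot before running a command and one after, then diff them. An edit that keeps the same size and restores the old mtime would slip through; hashing file contents would catch that at the cost of reading every file.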

I am just not comfortable with running commands and not having an idea what changed and where. It's insane that everyone is just accepting this! I am not okay with it, I want to see an exact breakdown of what changed and where and how.

IMO working on this and bringing it to the mainstream is loooong overdue.

I think it's that few people can see its potential. When I first started using immutable infra like 10 years ago, and saw how many problems it solved, my mind was blown. Until I saw the difference myself, it just looked like some trivial CS concept.

It's not apparent that problems X, Y and Z will be solved by immutability. Once it's applied everywhere, whole classes of problems just disappear. But until people see the problems disappear, they won't implement it. Catch-22.

True, plus not many devs are directly exposed to the problems and thus the will to fix the problem never has a chance to materialize.

One of the best-oiled teams I was in had devs and sysadmins work together closely. If Jim made a huge Python mess out of his small throwaway project (that the CEO needed because he wanted a nice chart for an investor meeting) that required several virtual environments and a particular (older) version of something, then the sysadmin had the power to call him out and question his methods. While not many programmers appreciate that, those that do make for a more positive workplace IMO.

RE: idempotency / immutability in general, I heard about Nix many times but I have been put off every time I tried it: cutesy (and rather dumb) terminology like pills and flakes and such, a Haskell dialect the world really did not need, tight binding between things (forgot which at this point, sorry), and the list kept growing until I just gave up. With all their quirkiness and edge cases my scripts still beat the pants off of Nix for my own goals. I mean, pacman/yay have a flag (--needed) that says "only install this package if not already installed" so... ¯\_(ツ)_/¯

But I really do want something like Nix (and no, not Guix either). Not only for packages -- for the entire system. I want to be able to plug in a USB drive and issue a command that says "show me new devices plugged in during the last 5 minutes, or since the last time I checked".

We don't have stuff like that. Or if we do, I am blissfully unaware of it. Can't we just start writing them and push their adoption? Every sysadmin team invents magic from scratch. Surely we can and should collectively do better...