by wpietri 2287 days ago
Are you sure you two are talking about the same thing?

My understanding of immutable infrastructure is the same as immutable data structures: once you create something, you don't mess with it. If you need a different something, you create a new one and destroy the old one.

That doesn't mean that the whole picture isn't changing all the time. Indeed, I think immutability makes systems overall more fluid, because it's easier to reason about changes. Mutability adds a lot of complexity, and when mutable things interact, the number of corner cases grows very quickly. In those circumstances, people can easily learn to fear change, which drastically reduces fluidity.
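The data-structure analogy can be made concrete with a minimal Python sketch. The `Server` class, its fields, and the image names are hypothetical, chosen only for illustration: a frozen record cannot be edited in place, so a "change" means building a new record and dropping the old one.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Server:
    """Hypothetical description of one server; frozen means no in-place edits."""
    name: str
    image: str


old = Server(name="web-1", image="image-v1")

# "Changing" the server means creating a new record from the old one.
new = replace(old, image="image-v2")

# Trying to modify the existing record in place is rejected.
try:
    old.image = "image-v2"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The old record is never altered, so anything still holding a reference to it sees exactly what it saw before the change.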


Yup. We do this. When our servers need a change, we change the AMI, for example, and then redeployment just replaces everything. Most servers survive a day, or only a few hours.
Makes sense to me. I was talking with a group of CTOs a couple of years back. One of them mentioned that they had things set up so that any machine more than 30 days old was automatically murdered, and others chimed in with similar takes.

It seemed like a fine idea to me. The best way to be sure that everything can be rebuilt is to regularly rebuild everything. It also solves some security problems, simplifies maintenance, and allows people to be braver around updates.
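A maximum-age policy like the 30-day rule could be sketched roughly as follows. This is only an illustration: the fleet dictionary, machine names, and `machines_to_replace` helper are assumptions, and a real setup would get launch times from its cloud provider's API and then trigger the actual replacement.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # the 30-day limit mentioned above


def machines_to_replace(launch_times, now, max_age=MAX_AGE):
    """Return the names of machines that have been running longer than max_age."""
    return sorted(
        name for name, launched in launch_times.items()
        if now - launched > max_age
    )


# Hypothetical fleet: machine name -> launch time.
now = datetime(2020, 1, 31, tzinfo=timezone.utc)
fleet = {
    "web-1": datetime(2019, 12, 15, tzinfo=timezone.utc),  # 47 days old
    "web-2": datetime(2020, 1, 20, tzinfo=timezone.utc),   # 11 days old
}

expired = machines_to_replace(fleet, now)
```

Run on a schedule, a check like this means every machine gets rebuilt regularly, which is what keeps the rebuild path exercised.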

Configuration Management is still present in this process, it's just moved from the live system to the image build step.
Probably the most insightful comment in this entire thread. Thank you. In many cases, an "image" is just a snapshot of what configuration management (perhaps not called such but still) gives you. As with compiled programming languages, though, doing it at build time makes future change significantly slower and more expensive. Supposedly this is for the sake of consistency and reproducibility, but since those are achievable by other means it's a false tradeoff. In real deployments, this just turns configuration drift into container sprawl.
Is this still as painful as it used to be? AMI building took ages, so iteration ("deployment") speed is really awful.
Personally that's why I avoid Packer (or other AMI builders) and keep very tightly focussed machines set up by the cloud-init type process.
So, once you create a multi-thousand-node storage cluster, if you need to change some configuration, replace the whole thing? Even if you replace onto the same machines - because that's where the data is - that's an unacceptable loss of availability. Maybe that works for a "stateless" service, but for those who actually solve persistence instead of passing the buck it just won't fly.
Could you say more about why your particular service can't tolerate rolling replacement of nodes? You're going to have to rebuild nodes eventually, so it seems to me that you might as well get good at it.

And just to be clear, I'm very willing to believe that your particular legacy setup isn't a good match for cattle-not-pets practices. But I think that's different than saying it's impossible for anybody to bring an immutable approach to things like storage.
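For readers unfamiliar with the term, rolling replacement can be simulated with a short sketch. Everything here is hypothetical (the `rolling_replace` helper, node names, and image labels), and it ignores the hard part for storage, which is moving or re-replicating data before a node goes away. It only shows that the cluster is replaced a batch at a time rather than all at once.

```python
def rolling_replace(nodes, new_image, batch_size=1):
    """Simulate replacing nodes batch by batch.

    nodes maps node name -> image. Yields a snapshot of the cluster after
    each batch, so at most batch_size nodes are out of service at a time.
    """
    current = dict(nodes)
    names = sorted(current)
    for start in range(0, len(names), batch_size):
        for name in names[start:start + batch_size]:
            # Stands in for: drain the node, destroy it, boot a fresh one.
            current[name] = new_image
        yield dict(current)


cluster = {"node-a": "v1", "node-b": "v1", "node-c": "v1"}
steps = list(rolling_replace(cluster, "v2"))
```

After the first step only `node-a` runs `v2`; the other two keep serving on `v1` until their turn comes.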

The person you're replying to didn't say "replace every node," they said "replace the whole thing."

To give a really silly example, adding a node to a cluster is a configuration change. It wouldn't make sense to destroy the cluster and recreate it to add a new node. There are lots of examples like this where if you took the idea of immutable infrastructure to the extreme it would result in really large wastes of effort.

Could you please point me at prominent advocates of immutable infrastructure who propose destroying whole clusters to add a node? Because from what I've seen, that's a total misunderstanding.
As I said, it's a silly example just to highlight an extreme. In between there are more fluid examples. I don't think it's that ridiculous to propose destroying and recreating the cluster in its entirety when you're deploying a new node image. However, as you say, I'm not sure anyone would advocate that except in specific circumstances.

On the other hand, while my suggestion of doing it to add a node sounds ridiculous, I'm sure there are circumstances in which it's not only understandable but necessary, due to some aspect of the system.

I'm saying it's not even an extreme, in that I don't believe what people are calling "immutable infrastructure" includes that.

If your biggest objection to an idea is that you can make up a silly thing that sounds like it might be related, I'm not understanding why we need to have this discussion. I'd like to focus on real issues, thanks.

Wow, look at those goalposts go! If you make enough exceptions to allow incremental change, then "immutable" gets watered down to total meaninglessness. That's not an interesting conversation. This conversation is about configuration management, which is still needed in a "weakly immutable" world.
Again, could you please point me at notable advocates of immutable infrastructure proposing the approach you take such exception to? And note that I'm not proposing any exceptions.
Presumably you replace the parts that changed and keep the parts that didn't.
Interesting to say you've "solve[d] persistence" when you seem to be limited by it here. Is there a particular reason your services can't be architected in a less stateful, more 12-factor way?
Kick the persistence can down the road some more? Sure, why not? But sooner or later, somebody has to write something to disk (or flash or whatever that doesn't disappear when the power's off). A system that stores data is inherently stateful. Yes, you can restart services that provide access or auxiliary services (e.g. repair) but the entire purpose of the service as a whole is to retain state. It's the foundation on top of which all the slackers get to be stateless themselves.
The vast majority of people simply redefine the terms to fit whatever they are selling.

If your systems are immutable, they can run read-only. In the nineties, Tripwire, the integrity checker, popularized it. You could run it off a CD-ROM. Today, immutable infrastructure is VMs/containers that can be run off a SAN or a pass-through file system that is read-only. It means snapshots are completely and immediately replicable. When you need to deploy, you take a base image/container, install your code onto it, run tests to ensure that it is not broken, and replicate it as many times as you need, in a read-only state. This approach also has an interesting property: because the system is read-only (as in exported to the instance read-only, or mounted by the instance read-only), it is extremely difficult to do nasty things to it after a break-in. If it is difficult to create files, it is difficult to stage exploits.

That's the only kind of infrastructure where configuration management on the instances themselves is not needed.