Hacker News
by cyberax 1066 days ago
My problem with the current IAC systems is the state storage. It should not be needed! Instead, the IAC tool should introspect the systems it's managing and build the necessary state on the fly.
3 comments

This does not work.

Say I have resource A with property X=1 defined in IAC. Someone comes along and sets X=2 outside of the tool. With your way, the IAC tool would see that change and think it was naturally part of the desired state, whereas stored state will catch the drift. And before anyone says “well, don't modify outside of IAC,” I say 1) that's often impractical and 2) sometimes automation can modify resources outside of IAC, beyond your control.

Also, dynamically creating state creates all sorts of concurrency issues, which is another nice thing about stored state, you can put a lock on it.

First, you can guard against it. Just periodically re-run the infra code in the "dry-run" mode (from a CI/CD system) and scream if you see any differences.

Second, this is still fine. Don't make changes outside of the IAC control. And if you do make them, retro-fix the IAC files until there is no diff with the actual state.

Third, IAC should have an option to ignore some changes.
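
To make the first and third points concrete, here is a rough sketch of a stateless dry-run with an ignore list. The names `dry_run`, `ci_check` and the flat dict model of a resource are made up for illustration; no real tool's API is implied.

```python
import sys


def dry_run(desired, actual, ignore=()):
    """Return {key: (declared_value, live_value)} for every declared key that differs."""
    diffs = {}
    for key, want in desired.items():
        if key in ignore:
            continue  # changes to this key are tolerated on purpose
        have = actual.get(key)
        if have != want:
            diffs[key] = (want, have)
    return diffs


def ci_check(desired, actual, ignore=()):
    """Exit code for a scheduled CI job: 0 when clean, 1 when drift was found."""
    diffs = dry_run(desired, actual, ignore)
    for key, (want, have) in sorted(diffs.items()):
        print(f"DRIFT {key}: declared {want!r}, live {have!r}", file=sys.stderr)
    return 1 if diffs else 0
```

With `X=1` declared and `X=2` live, `dry_run` reports the difference without consulting any stored state. Passing `ignore={"replicas"}` silences a key that something else (an autoscaler, say) is allowed to change.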

> Also, dynamically creating state creates all sorts of concurrency issues, which is another nice thing about stored state, you can put a lock on it.

In my experience, this is not a big issue in practice. Production deployments should be done through some kind of CI/CD, and it naturally serializes builds.

However, nothing stops you from adding locking without doing the full state management.
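
As a minimal sketch of locking without any state file, assuming a single machine and a hypothetical lock path, an exclusive lock file is enough to serialize runs:

```python
import os
from contextlib import contextmanager


class LockHeld(Exception):
    """Raised when another run already holds the deploy lock."""


@contextmanager
def deploy_lock(path):
    try:
        # O_CREAT | O_EXCL makes creation atomic: it fails if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockHeld(path) from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the holder for debugging
        yield
    finally:
        os.close(fd)
        os.remove(path)  # release the lock
```

A local file only covers runs on the same host, and a crashed run leaves a stale lock behind. A shared setup would need the same idea on top of some shared store, which is outside this sketch.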

> Second, this is still fine. Don't make changes outside of the IAC control. And if you do make them, retro-fix the IAC files until there is no diff with the actual state.

This doesn't work in practice. Some aspects of the business want to tweak things and it should be reasonably guaranteed that the automated side never touches it.

Terraform state gives this assurance because it won't destroy resources not under its state.
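
A simplified model of that guarantee, using plain sets of resource IDs. This is not Terraform's actual implementation, and both function names are hypothetical:

```python
def plan_destroy(declared, recorded, live):
    """With an ownership record: delete only what the tool created earlier
    (recorded), that is no longer declared, and that still exists (live)."""
    return (recorded - declared) & live


def plan_destroy_stateless(declared, live):
    """Without an ownership record, every undeclared resource looks like a leftover."""
    return live - declared


declared = {"vpc", "db"}
recorded = {"vpc", "db", "old-queue"}
live = {"vpc", "db", "old-queue", "manual-bucket"}

print(plan_destroy(declared, recorded, live))   # {'old-queue'}
print(plan_destroy_stateless(declared, live))   # contains 'manual-bucket' too
```

The hand-made `manual-bucket` survives only in the first plan. The ownership record does not have to be a state file (tags on the resources could play the same role), but something has to say which resources belong to the tool.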

> Some aspects of the business want to tweak things and it should be reasonably guaranteed that the automated side never touches it.

What would a legitimate case for this be?

It seems to me like any change either must be done via IAC -- and tracked in source control, PR'd, tested in non-prod, etc. -- or it points to a missing feature.

If there's a legitimate case for modifying something not in IAC, it should be supported -- this is what I mean by "missing feature". The app and/or IAC should have code for that feature.

Modifying IAC-deployed settings is akin to someone hacking the binary of an executable from a software vendor while still expecting the vendor to support that modified executable. Not gonna happen.

There are 2 different use cases. One is where you want your IAC configuration to be the source of truth - any changes made outside of IAC are drift and should be fixed. The other one is where you want to take the changes that are made OOB and update your IAC configuration to mirror them - in this case you use the IAC config to document the state of your live infra.
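
The two use cases can be sketched as two directions of the same diff. The function names and the flat dict model are assumptions for illustration only:

```python
def enforce(declared, live):
    """Config is the source of truth: return the settings to push back onto live infra."""
    return {key: want for key, want in declared.items() if live.get(key) != want}


def adopt(declared, live):
    """Live infra is the source of truth: return the config updated to mirror it."""
    return {key: live.get(key, want) for key, want in declared.items()}


declared = {"X": 1, "Y": "a"}
live = {"X": 2, "Y": "a"}

print(enforce(declared, live))  # {'X': 1}
print(adopt(declared, live))    # {'X': 2, 'Y': 'a'}
```

Neither direction needs stored state here. The choice is only about which side wins when they disagree.
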
The IAC tool could just as easily recognize the change as a part of the current state instead of the desired, and revert the drift. Whereas stored state would likely miss the drift without another process to compare and update the stored state against the actual.
That is how Puppet works. Introspect the current state, compare with the desired state, fix as needed. It mostly works, but in reality it will never reach the point of introspecting literally all of the current state. So there are always ways to subtly break things without the tool noticing. (E.g., a file object that ensures the correct path, contents, ownership, and mode, but doesn’t check xattrs or ACL. [That’s hypothetical, not how the actual Puppet file module works.])
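
A toy version of that blind spot, following the same hypothetical as above (it is not how Puppet's file type behaves). The tool only compares the attributes it knows how to introspect:

```python
# Hypothetical: attributes this toy tool knows how to introspect. "acl" is not among them.
CHECKED = ("path", "content", "owner", "mode")


def converge(desired, current, checked=CHECKED):
    """Return the fixes to apply: only checked attributes are ever compared."""
    return {attr: desired[attr] for attr in checked if current.get(attr) != desired[attr]}


desired = {"path": "/etc/app.conf", "content": "a=1\n", "owner": "root",
           "mode": "0644", "acl": ""}
current = dict(desired, mode="0666", acl="user:bob:rw")

print(converge(desired, current))  # {'mode': '0644'}
```

The mode gets fixed, while the ACL change never shows up in the result because nothing looks at it.
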
A lot of people will argue that state helps protect against drift, but the real reason I find that you have to have state is to store values that won't be returned a second time and still construct and connect the graph of resources in the IaC templates. For example, if you declare the need for an RDS database and connect its output credentials into another application, you'll need state in order for the applies to work a second time because you'll never be able to retrieve the values from the target provider again.
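
A small sketch of why that matters, with a made-up in-memory provider (`FakeProvider` and `apply` are hypothetical names, not any real SDK). The password comes back once, at creation:

```python
import secrets


class FakeProvider:
    """Hypothetical provider: the generated password is returned only by create_db()."""

    def __init__(self):
        self._passwords = {}

    def create_db(self, name):
        password = secrets.token_urlsafe(16)
        self._passwords[name] = password
        return {"name": name, "password": password}

    def describe_db(self, name):
        # Reads never return the secret again.
        return {"name": name}


def apply(provider, state):
    """Create the database on the first run; later runs reuse the stored output."""
    if "db" not in state:
        state["db"] = provider.create_db("orders")
    # Wire the database output into the dependent application's settings.
    return {"DB_PASSWORD": state["db"]["password"]}
```

On a second `apply`, `describe_db` has nothing to offer, so the app settings can only be rebuilt from what was kept in `state`.
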
Yeah, and now all your creds are available for everyone to see. Instead use IAM authentication for RDS, or if it's impossible, store creds in SSM or Secrets Manager.

Yeah, it's not fully transactional, but it will work fine in practice.

State is just a poor crutch.