by cyberpunk
230 days ago
> Most systems handle this defensively with locks and runtime validation. So I work at an org with 1000s of Terraform repos; we use the enterprise version, which locks workspaces during runs, etc. Everywhere else I’ve worked, we either just used some lock mechanism or only did applies from a specific branch, with CI enforcing that they run one at a time. My question is: who is this aimed at and what problem is it actually solving? Running Terraform isn’t difficult - thousands of orgs handle it no problem - and the issues I have with it have never been around lock contention and race conditions.
As you said, the common practice is to use locks on state to guarantee that operations don't step on each other. This works, but the cost is that if an operation takes 5 minutes, only one person can run an operation at a time. So if 5 devs are modifying infrastructure, the last one has to wait 25 minutes just to get back the plan, even if those 5 people are not changing overlapping resources in the state.
The way most people deal with this is to take their infrastructure and break it up across multiple root modules, and then, when those root modules get too slow, break them up again, etc.
Stategraph is solving the problem of getting all of the performance benefits of breaking up your root modules without breaking up your root modules. It dynamically determines which resources each of those 5 devs is operating on and, if the resources do not overlap, can run them in parallel.
That means Stategraph is manipulating state in a bit more sophisticated way than standard Terraform/Tofu, and we need to be careful we don't get it wrong.
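To make the waiting-time difference concrete, here is a rough sketch in Python. It is not how Stategraph is implemented; it is a toy model that assumes every operation takes the same time and that two operations conflict only if their resource sets intersect (a real tool would also have to follow dependencies between resources). The function names and resource addresses are made up for illustration.

```python
from typing import List, Set


def finish_times_global_lock(ops: List[Set[str]], minutes_per_op: int) -> List[int]:
    """One state-wide lock: operations run strictly one after another."""
    return [(i + 1) * minutes_per_op for i in range(len(ops))]


def finish_times_resource_aware(ops: List[Set[str]], minutes_per_op: int) -> List[int]:
    """Toy model: an operation only waits for earlier operations
    whose resource sets overlap with its own."""
    finish: List[int] = []
    for i, resources in enumerate(ops):
        start = 0
        for j in range(i):
            if resources & ops[j]:  # shared resource, so must wait for op j
                start = max(start, finish[j])
        finish.append(start + minutes_per_op)
    return finish


# Hypothetical example: 5 devs, each touching a different resource.
disjoint_ops = [
    {"aws_instance.web"},
    {"aws_s3_bucket.logs"},
    {"aws_iam_role.ci"},
    {"aws_db_instance.main"},
    {"aws_sqs_queue.jobs"},
]

# Hypothetical example: the second dev touches the same VPC as the first.
overlapping_ops = [
    {"aws_vpc.main"},
    {"aws_vpc.main", "aws_subnet.a"},
    {"aws_s3_bucket.logs"},
]

print(finish_times_global_lock(disjoint_ops, 5))
print(finish_times_resource_aware(disjoint_ops, 5))
print(finish_times_resource_aware(overlapping_ops, 5))
```

With the single lock the fifth dev finishes at minute 25; in the resource-aware model all five finish at minute 5, and only the dev who overlaps with someone else has to wait.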