| HN Mirror

I work with a large group of engineers that manage a very large array of infrastructure. We see weird Terraform issues all the time. There's a multitude of ways that Terraform gets into a bad state and has to be fixed manually (in production). Even a Terraform expert runs into them, because it's not necessarily an issue "with Terraform", but with a buggy provider, or some feature of Terraform which wasn't tested well in certain scenarios, etc.

Terraform allows for too much complex configuration/operation, the codebases change too frequently, there's not enough testing, and even extremely simple operations fail in a way that can't be reverted automatically. In practice the tool is clunky, complicated, difficult, and unreliable. Whenever I run "terraform apply" I know I am rolling dice, and plan for how I'm going to recover everything if I need to (which was what Terraform was supposed to prevent!)

But at the same time, if lots of people need to manage the same infra, you really have to use some common tool. Bash scripts are a great fix for small isolated problems, but they don't scale.