by gizdan 1817 days ago
This sounds like a lack of understanding of Terraform. We use Terraform pretty heavily and I've rarely seen bad states across our whole org, and the few that I do see are usually caused by people who don't know the core concepts (often non-devops engineers).

Terraform has its faults, but it is the best in its class, especially when you need to manage infrastructure beyond a single cloud provider (e.g. we manage our Datadog monitors and dashboards, PagerDuty alerts and much more). The only other thing that would probably thrash it is Pulumi, which has similar concepts, except you can use many different languages as opposed to HCL (no, CDK doesn't count because it is still very immature, and last I checked it only supported one or two languages).

2 comments

I work with a large group of engineers that manage a very large array of infrastructure. We see weird Terraform issues all the time. There's a multitude of ways that Terraform gets into a bad state and has to be fixed manually (in production). Even a Terraform expert runs into them, because it's not necessarily an issue "with Terraform", but with a buggy provider, or some feature of Terraform which wasn't tested well in certain scenarios, etc.

Terraform allows for too much complex configuration/operation, the codebases change too frequently, there's not enough testing, and even extremely simple operations fail in a way that can't be reverted automatically. In practice the tool is clunky, complicated, difficult, and unreliable. Whenever I run "terraform apply" I know I am rolling dice, and plan for how I'm going to recover everything if I need to (which was what Terraform was supposed to prevent!)

But at the same time, if lots of people need to manage the same infra, you really have to use some common tool. Bash scripts are a great fix for small isolated problems, but they don't scale.

I completely agree with your points there and that is probably the issue.