This sounds like a lack of understanding of Terraform. We use Terraform pretty heavily and I've rarely seen bad states across our whole org, and the few that I do see usually come from people who don't know the core concepts (often non-DevOps engineers).
Terraform has its faults, but it is the best in its class, especially when you need to manage infrastructure beyond a single cloud provider (e.g. we manage our Datadog monitors and dashboards, PagerDuty alerts and much more). The only other thing that would probably thrash it is Pulumi, which has similar concepts, except you can use many different languages as opposed to HCL (no, CDK doesn't count, because it is still very immature and, last I checked, it only supported one or two languages).
I work with a large group of engineers that manage a very large array of infrastructure. We see weird Terraform issues all the time. There's a multitude of ways that Terraform gets into a bad state and has to be fixed manually (in production). Even a Terraform expert runs into them, because it's not necessarily an issue "with Terraform", but with a buggy provider, or some feature of Terraform which wasn't tested well in certain scenarios, etc.
Terraform allows for too much complex configuration/operation, the codebases change too frequently, there's not enough testing, and even extremely simple operations fail in a way that can't be reverted automatically. In practice the tool is clunky, complicated, difficult, and unreliable. Whenever I run "terraform apply" I know I am rolling the dice, and I plan how I'm going to recover everything if I need to (which is what Terraform was supposed to prevent!).
But at the same time, if lots of people need to manage the same infra, you really have to use some common tool. Bash scripts are a great fix for small isolated problems, but they don't scale.
Agreed, I moved my team from Terraform to Ansible and then to SAM... SAM has been the most reliable, resilient, and stable for my use cases (general serverless).
It's at least second or third worst. Worst would be writing your own deployment tool that does what CloudFormation (or TF or Pulumi) does. Second worst would be writing a tool that uses a templating language to generate CloudFormation and only using that.
If you prefer imperative infrastructure creation to declarative then I think you're doing something wrong. Both Terraform and CloudFormation are quite easy to manage compared to writing and managing scripts (bash or otherwise).
Using Terraform for this is great because it removes the unwanted alarms.
I had to create alarms when the instances auto scale, so I wrote a Python script using cdktf, and now a Jenkins job handles it. It even updates the CloudWatch dashboard.
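For anyone wondering why the declarative approach cleans up stale alarms: this is a rough sketch of the idea in plain Python, not the actual cdktf script. The function name and the `cpu-high-` alarm prefix are made up for illustration. You declare the alarms you want for the instances that currently exist, and anything left over from instances that have gone away ends up in the delete set.

```python
def plan_alarms(desired_instances, existing_alarms, prefix="cpu-high-"):
    """Return (to_create, to_delete) alarm names.

    desired_instances: instance IDs that should each have one alarm.
    existing_alarms:   alarm names that currently exist.
    prefix:            hypothetical naming convention for managed alarms;
                       alarms without it are left alone.
    """
    desired = {f"{prefix}{instance_id}" for instance_id in desired_instances}
    # Only consider alarms this tool manages, so unrelated alarms are never deleted.
    managed = {name for name in existing_alarms if name.startswith(prefix)}
    to_create = sorted(desired - managed)
    to_delete = sorted(managed - desired)
    return to_create, to_delete


if __name__ == "__main__":
    create, delete = plan_alarms(
        ["i-1", "i-2"],
        ["cpu-high-i-2", "cpu-high-i-3", "unrelated-alarm"],
    )
    print("create:", create)
    print("delete:", delete)
```

Terraform does this kind of diff against its state for you, which is why the alarm for a terminated instance disappears on the next apply instead of lingering.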
It’s one of those things that really works pretty well, but there are enough edge cases to make it slightly soul-sucking.