I've had a similar experience: shipping new features at incredible speed, then wasting a ton of time going down the wrong track trying to debug something because the LLM gave me a confidently wrong solution.
I think what the parent's post describes has happened to everybody, and if it hasn’t, it will.
The line between being actually more productive and just “pretend productive” with large language models is something none of us have completely figured out yet.
Often it's something you casually overlook, some minor implementation detail you didn't give much thought to, that ends up being a huge mess later on, IME.
Seems like LLMs would be well suited for test-driven development. A human writes the tests and the LLM generates code that passes all of them, ending with a solution that meets the human's expectations.
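For what it's worth, here is a minimal sketch of that loop in Python. Everything in it is made up for illustration: `human_written_tests` is a hypothetical spec a person would write first, `candidate_slugify` is a stand-in for whatever the LLM hands back, and `accept` is just the gate between the two.

```python
import re


def human_written_tests(slugify):
    """Human-authored spec. Returns a list of failure messages (empty means pass)."""
    cases = [
        ("Hello World", "hello-world"),
        ("  Trim me  ", "trim-me"),
        ("Already-slugged", "already-slugged"),
        ("Symbols & stuff!", "symbols-stuff"),
    ]
    failures = []
    for text, expected in cases:
        actual = slugify(text)
        if actual != expected:
            failures.append(f"slugify({text!r}) = {actual!r}, expected {expected!r}")
    return failures


def candidate_slugify(text):
    """Stand-in for LLM-generated code (hypothetical, not from any real model)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen,
    # then drop hyphens left over at either end.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def accept(candidate):
    """Accept generated code only if every human-written test passes."""
    return not human_written_tests(candidate)
```

You'd call `accept(candidate_slugify)` and, on rejection, feed the failure messages back into the next prompt. Note the gate only checks behavior against the cases the human thought of; it says nothing about whether the generated code is readable or simple.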
This is more or less how I use LLMs right now. They’re fantastic at the plumbing, so I can focus on the important part: the business and domain logic.
I disagree, because you're only considering the "get code to make the test pass" step. Refactoring, refining, and simplifying are critical, and I've yet to see those applied well. (I've also yet to see the former applied usably well, despite "write tests, generate code" being an early direction.)