| If you're using llms to shit out large swathes of unreviewed code you're doing it wrong and your project is indeed doomed to become unmaintainable the minute it goes down a wrong path architecturally, or you get a bug with complex causes or whatever. Where llms excel is in situations like: * I have <special snowflake pile of existing data structures> that I want to apply <well known algorithm> to - bam, half a days work done in 2 minutes. * I want to set up test data and the bones of unit tests for <complicated thing with lots of dependencies> - bam, half a days work done in 2 minutes (note I said to use the llms for a starting point - don't generate your actual test cases with it, at least not without very careful review - I've seen a lot of really dumb ai generated unit tests). * I want a visual web editor for <special snowflake pile of existing data structures> that saves to an sqlite db and has a separate backend api, bam 3 days work done in 2 minutes. * I want to apply some repetitive change across a large codebase that's just too complicated for a clever regex, bam work you literally would have never bothered to do before done in 2 minutes. You don't need to solve hard problems to massively increase your productivity with llms, you just need to shave yaks. Even when it's not a time save, it still lets you focus mental effort on interesting problems rather than burning out on endless chores. |
You would naively think that, as did I, but I've tested it against several big name models and they are all eventually "lazy", sometimes make unrelated changes, and worse as the context fills up.
On a small toy example they will do it flawlessly, but as you scale up to more and more code that requires repetitive changes the errors compound.
Agentic loops help the situation, but now you aren't getting it done in 2 minutes because you have to review to find out it wasn't done and then tell it to do it again N times until it's done.
Having the LLM write a program to make the changes is much more reliable.