| I've just spent the better part of two weeks trying to convince an LLM to automate some programming for me. We use feature flags, but cleaning them up is rarely done. It typically takes me ~3 minutes to clean one up. To clean up a flag: 1) delete the test where the flag is off; 2) delete all the code setting the flag to on; 3) replace anything getting the value of the flag with true; 4) resolve all "true" expressions, cleaning up ifs and now-constant parameters; 5) prep a pull request and send it for review. This is all fully supported by the indexing and refactoring tooling in my IDE. However, when I prompted the LLM with those steps (and examples), it failed. Over and over again. It would delete tests where the value was true, forget to resolve the expressions, and try to run grep/find across a ginormous codebase. If this were an intern, I would only have to correct them once. I would correct the LLM, and then it would make a different mistake. It wouldn't follow the instructions, and it would use tools I told it not to use. It took 5-10 minutes to make the change, and then I would have to spend a couple of minutes fixing things. It was at the point of not saving me any time. I've got a TONNE of low-hanging fruit that I can't give to an intern, but could easily sic a tool as capable as an intern on. This was not that. |