by fastball
372 days ago
The idea as stated is a poor one, but with a slight reshuffling it seems promising. You generate code with LLMs. You write tests for this code, either using LLMs or on your own. You of course commit your actual code: it is required to actually run the program, after all. However, you also save the entire prompt chain somewhere. Then (as stated in the article), when a much better model comes along, you re-run that chain, presumably with an added instruction like "create this project, focusing on efficiency" or "create this project in Rust" or "create this project, focusing on readability of the code". Then you run the tests against the new codebase, and if the suite passes you carry on with a much improved codebase. The theoretical benefit of this over just giving your previously generated code to the LLM and saying "improve the readability" is that the newer (better) LLM is not burdened by the context of the "worse" decisions made by the previous LLM. Obviously it's not actually that simple, as tests don't catch everything (though with fuzz testing, complete coverage, and the like they can catch most issues), but we programmers often treat them as if they do, so it might still be a worthwhile endeavor.
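To make the loop concrete, here is a rough Python sketch of what I mean. Everything in it is hypothetical: `Project`, `regenerate`, the `Model` and `TestRunner` callables, and the stub models are made-up names for illustration, not any real tool's API. A "model" here is just a function from a prompt chain to a set of files, and the "test suite" is a single check.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical types: a "model" maps a prompt chain to {filename: source},
# and a "test runner" returns True when the whole suite passes.
Codebase = Dict[str, str]
Model = Callable[[List[str]], Codebase]
TestRunner = Callable[[Codebase], bool]


@dataclass
class Project:
    prompt_chain: List[str]  # saved alongside the code, never thrown away
    code: Codebase           # the committed code that actually runs


def regenerate(project: Project, new_model: Model, run_tests: TestRunner,
               extra_instruction: Optional[str] = None) -> bool:
    """Re-run the saved prompt chain on a newer model.

    The candidate replaces the committed code only if the tests pass.
    """
    chain = list(project.prompt_chain)
    if extra_instruction:
        chain.append(extra_instruction)
    # The new model sees only the prompts, never the old generated code.
    candidate = new_model(chain)
    if not run_tests(candidate):
        return False  # suite failed: keep the existing code untouched
    project.code = candidate
    project.prompt_chain = chain
    return True


# --- Stand-ins for a real model and a real test suite (assumptions) ---

def run_tests(code: Codebase) -> bool:
    """Toy suite: load calc.py and check a single behavior."""
    namespace: dict = {}
    try:
        exec(code["calc.py"], namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


def better_model(chain: List[str]) -> Codebase:
    return {"calc.py": "def add(a, b):\n    return a + b\n"}


def broken_model(chain: List[str]) -> Codebase:
    return {"calc.py": "def add(a, b):\n    return a - b\n"}
```

The point of the design is that `regenerate` never hands the old code to the new model, so the gate is only as good as `run_tests`: anything the suite doesn't cover can silently change between generations.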