by rich_sasha
46 days ago
People say LLMs do better on tasks where success is clearly defined, like tests passing, and I can imagine that's true. Still, I find that complex code fixes verified by tests often end with the LLM fudging the code to make the specific test pass, rather than fixing the general issue. For example, where a successful run should generate a file and the test checks for that file, the LLM will eventually just touch the file regardless and be done.

This has completely solved the cheating and fudging to make tests pass for me.
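
To make the file example from rich_sasha's comment concrete, here is a minimal sketch of why an existence-only test is easy to game and how a content check raises the bar. Everything in it is hypothetical and for illustration only: `export_report`, `fudged_export`, `weak_check`, `strict_check`, and the JSON report format are assumptions, not anything from the thread, and this is not the fix the reply above refers to.

```python
import json
import tempfile
from pathlib import Path


def export_report(rows, out_path):
    """Hypothetical function under test: writes rows to out_path as JSON."""
    Path(out_path).write_text(json.dumps(rows))


def fudged_export(rows, out_path):
    """The 'cheat': create an empty file and ignore the data entirely."""
    Path(out_path).touch()


def weak_check(export, rows):
    """Passes as long as the output file exists, whatever it contains."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "report.json"
        export(rows, out)
        return out.exists()


def strict_check(export, rows):
    """Passes only if the file parses and round-trips to the input rows."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "report.json"
        export(rows, out)
        try:
            return json.loads(out.read_text()) == rows
        except (OSError, json.JSONDecodeError):
            # Missing, unreadable, or non-JSON output counts as a failure.
            return False
```

With sample rows, `weak_check(fudged_export, rows)` passes while `strict_check(fudged_export, rows)` fails, and the honest `export_report` passes both. A content check like this only narrows the room for fudging; a hard-coded output could still satisfy a single fixed input, so varying the inputs helps too.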