by pushedx
115 days ago
Sorry, needed to edit this comment to ask the same question as the sibling: have you run these models in an agent mode that lets the agent execute the tests, view the output, and iterate on its own for a while, up to an hour or so? You will get vastly different output if you ask the agent to write 200 of its own test cases and then have it iterate from there.