Well that is not how anyone is doing agentic coding though. That sounds like just a worse version of traditional coding. Most people are building test suites to verify correctness and not caring about the code
Test suites don't verify correctness. They just ensure that you haven't broke something so bad the specific instances that the tests assert have turned into a failure. You can have a factorial function and more likely the test cases will only be a few numbers. Which does not guarantee correctness as someone who know about the test cases can just put a switch and return the correct response for those specific cases.
The compromise is worth it in traditional coding, because someone will care about the implementation. The test cases are more like the canary in the coal mine. A failure warrants investigations but an all green is not a guarantee of success.
The compromise is worth it in traditional coding, because someone will care about the implementation. The test cases are more like the canary in the coal mine. A failure warrants investigations but an all green is not a guarantee of success.