Hacker News new | ask | show | jobs
by ben_w 64 days ago
> Potentially, yes, but as with other software, you need to know AND have (automated) verifications on what it does, exactly.

Yes, but even here one needs some oversight.

My experiments with Codex (on Extra High, even) was that a non-zero percentage of the "tests" involved opening the source code (not running it, opening it) and regexing for a bunch of substrings.