| I only skimmed this. There is a lot of “not to be read, but to be verified; process, not the artifact; not x but...”. Take “AI-checks-AI pipelines as first-class CI infrastructure, not bolt-on curiosity”: what’s the contrast here? A serious aspiration rather than an unserious one? Or “Formal specification layers that agents execute against, not just prompts”. Okay. It looks like it is stating a lot of problems in an x-not-y form, as if progress were being made by way of insistence. I am open to the idea of something like a small verification kernel that can be comprehended by “humans” and that can check GenAI output. But right now we can contrast mature (decade+) compilers with GenAI like this. Compilers: you get the abstraction you asked for; it might not be “optimal” code, but it is code that works the way you wrote it. GenAI: here is 200KLOC, good luck, could be anything. Now you could reduce the space of those 200KLOC with tests and verification. But so far (based on this submission) it looks like this is at the handwaving stage. Certainly you would need high-value tests if tests are supposed to be the verification: either something simple and expressive enough for “humans” to write, or something short and easy for “humans” to read (and generated by GenAI). Not some copy-paste-smelling mockfest that looks like a pile of junk that has evolved over five years, each author pushing more junk on top while taking care not to make the whole pile tilt and collapse. |