|
|
|
|
|
by jdw64
4 hours ago
|
|
If you look at the history of software engineering, the ones that made the most money were usually not the companies that built the applications themselves, but the ones that built the tools to verify, deploy, and build them, such as CI/CD, static analysis tools, and testing frameworks. Personally, I agree with the Goodhart problem, but isn't the reason Eval startups fail because they try to sell an 'evaluation service' rather than a 'verification toolchain'? The problem, it seems, is that AI verification toolchains require a model in the end, because they internalize AI and sell it under the name of a 'harness.' So an AI verification(eval) toolchain would have to be structurally different. Verifying AI code isn't about whether it compiles. AI code can always be made to compile. The issue involves various semantic criticisms, such as overfitting to existing designs and tests. To catch those issues, you ultimately need to build an AI. But building that AI is expensive. So in the end, AI verification companies depend on external model providers for the core components of their verification engine. I think this is a bad business decision |
|
> built the tools to verify, deploy, and build them, such as CI/CD, static analysis tools, and testing frameworks.
Curious. Which company made money with testing frameworks?