| Organizations now generate 10x the amount of code, because everyone can do it. But we have exactly the same number of reviewers. How the heck are we gonna deal with it when we cannot use LLMs for sanity checking LLM code? Like literally yesterday I had a non-technical person who used Codex to build an optimization algorithm, and due to the momentum it gained I was asked to “fix the rough edges and help with scaling”. The entire thing was trash (it was trying to do naive search on a combinatorial problem with 1000s of integers, and was violating constraints with high probability, including integrality). I had to spend my whole day reviewing it and make a technical presentation to their leadership that it is just a polished turd. |
Unit testing. LLMs are very good at writing tests and at writing testable code (as long as you ask for it). If you check that the tests actually call the code, that they cover all the obvious edge cases, and that the expected results are correct, that's quite fast to review, faster than reviewing the code itself.
And you can put things like performance checks in the tests as well, e.g. a time budget on a realistically sized input.
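To make that concrete for the optimizer case in the quote, here's a rough sketch of the kind of tests I'd want to review. Everything in it is an assumption for illustration: `solve_knapsack` is a hypothetical stand-in for the generated code (a small 0/1 knapsack DP, not the actual algorithm from that story), and the 5-second budget and input sizes are made-up numbers. The point is that the checks for integrality, constraint feasibility, and optimality against brute force on tiny inputs are short enough to read in a couple of minutes.

```python
import itertools
import time


def solve_knapsack(values, weights, capacity):
    """Hypothetical code under review: 0/1 knapsack by dynamic programming.

    Returns a list of 0/1 ints, one per item.
    """
    n = len(values)
    # best[i][c] = best total value using the first i items with capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]
            if w <= c:
                best[i][c] = max(best[i][c], best[i - 1][c - w] + v)
    # Walk back through the table to recover which items were picked.
    picks, c = [0] * n, capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            picks[i - 1] = 1
            c -= weights[i - 1]
    return picks


def brute_force_best(values, weights, capacity):
    """Reference answer by full enumeration; only usable on tiny inputs."""
    best = 0
    for picks in itertools.product((0, 1), repeat=len(values)):
        if sum(p * w for p, w in zip(picks, weights)) <= capacity:
            best = max(best, sum(p * v for p, v in zip(picks, values)))
    return best


def check_solution(picks, values, weights, capacity):
    assert len(picks) == len(values)
    assert all(isinstance(p, int) and p in (0, 1) for p in picks)  # integrality
    assert sum(p * w for p, w in zip(picks, weights)) <= capacity  # feasibility


def test_edge_cases():
    cases = [
        ([], [], 10),                 # no items
        ([5], [3], 0),                # zero capacity
        ([5], [11], 10),              # single item that does not fit
        ([6, 10, 12], [1, 2, 3], 5),  # small instance with a known optimum
    ]
    for values, weights, capacity in cases:
        picks = solve_knapsack(values, weights, capacity)
        check_solution(picks, values, weights, capacity)
        got = sum(p * v for p, v in zip(picks, values))
        assert got == brute_force_best(values, weights, capacity)


def test_performance_budget():
    # Assumed sizes and budget; pick numbers that match your real workload.
    n = 200
    values = [(7 * i) % 23 + 1 for i in range(n)]
    weights = [(5 * i) % 17 + 1 for i in range(n)]
    start = time.perf_counter()
    picks = solve_knapsack(values, weights, 500)
    elapsed = time.perf_counter() - start
    check_solution(picks, values, weights, 500)
    assert elapsed < 5.0


if __name__ == "__main__":
    test_edge_cases()
    test_performance_budget()
    print("all checks passed")
```

A naive search that violates constraints would fail `check_solution` on the first feasible-but-tight case, and one that can't scale would blow the time budget, without anyone having to read the implementation line by line.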
We're moving to a world where we work with definitions and tests, and are less concerned with the precise details of how the code inside each function is written. That's a big shift in mindset.