by fulladder 154 days ago

So glad that I'm not the only one struggling with these huge generated PRs that are too big to honestly review, all while an AI reassuringly whispers in my ear "just trust me."

Don't get me wrong, overall I really like having AI in my workflow and have gotten many benefits from it. But even when I ask it to check its own work by writing test cases to prove that properties A, B and C hold, I just end up with thousands more lines of unit and integration tests that then take even more time to analyze. What exactly is being tested here? Are the properties these tests purport to prove even the properties that I care about and asked the agent for in the first place?

I have tried (with at least modest success) to use a second or third agent to review the work of the original coding agent(s), but my general finding has been that there is no substitute for actual human understanding from a legitimate domain expert.

Part of my work involves silicon design, which demands a lot of precision and involves complex timing issues. The best AI success I've had in those cases is with a test-first approach (TDD): I hand-write a boatload of testbenches (that's what we call functional tests in chip design land), then coach my various agents to write the Verilog until my `make test` runs with no errors.
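
A minimal sketch of that test-first loop, in Python rather than Verilog for brevity. The 1-bit full adder and every name here (`expected`, `full_adder`, `run_tests`) are hypothetical stand-ins, not anything from a real project: the reference model and the checker play the role of the hand-written testbench, and `full_adder` is the part an agent would be asked to rewrite until everything passes.

```python
from itertools import product

# Hand-written reference model (hypothetical): stands in for a testbench.
# It states what a 1-bit full adder must produce for any input.
def expected(a, b, cin):
    total = a + b + cin
    return total & 1, total >> 1  # (sum bit, carry-out bit)

# Candidate implementation (hypothetical): the part an agent would write.
def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def run_tests(dut):
    """Check dut against the reference on all 8 input vectors.

    Returns the list of failing vectors; an empty list means a pass.
    """
    failures = []
    for a, b, cin in product((0, 1), repeat=3):
        if dut(a, b, cin) != expected(a, b, cin):
            failures.append((a, b, cin))
    return failures

if __name__ == "__main__":
    failures = run_tests(full_adder)
    # Nonzero exit status on failure, so a `make test` target wrapping
    # this script would fail as well.
    raise SystemExit(1 if failures else 0)
```

The spec side stays small enough to read in full, and only `full_adder` changes between iterations, so the human review effort goes into the tests rather than into whatever the agent generated.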