| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by staticassertion 147 days ago

My point is that "verifiable testcases" works great for proving "this is vulnerable" but LLMs are still risky if you believe "this is safe", which you can't easily prove. My point is that you need to be very skeptical of when they decide that something isn't vulnerable.

I completely agree that LLMs are great when instructed to provide provable, repeatable exploits. I have done this multiple times and uncovered some neat bugs.

> I can't really confirm the part about "local" bugs anymore though, but that might also be a model thing.

I don't think it's a model thing, it's just a sort of basic limitation of the technology. We shouldn't expect LLMs to perform novel tasks so we shouldn't expect LLMs to find novel vulnerabilities.

Agents help, human in the loop is critical for "injecting novelty" as I put it. The LLM becomes great at producing POCs to test out.