| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by mccr8 53 days ago
	The basic technique (as has been publicly described by Anthropic) is you ask one agent to come up with a test case that triggers, say, an ASan use-after-free. Then you have a second agent that validates the test case. This eliminates a lot of false positives. It gets a little tricky when you allow the first agent to modify the code, which is necessary for things like sandbox escapes where you want to demonstrate that sending bad IPC causes problems.

1 comments

ygjb 53 days ago

And the following part that is more important is the agent that attempts the fix in code, the agent that tests the fix and reports on perfomance and functional impacts, and the agent the triggers the build and release to production.

Everything up to finding and validating the bug is a huge win in vuln/exploit development, everything after validating the bug is a huge win for defensive security and a massive gap until the tools are generally available :S

link