We Benchmarked Claude Code, Codex, Semgrep, CodeQL, Trent on 28 CWE-Bench CVEs

trent.ai | 6 points by geopsist 20 days ago

2 comments

kbrajesh176 19 days ago

Looks interesting. LLM-based solutions fail when the metric is strict. For a security solution, a guess is not enough: we need a reliable, robust way to pinpoint the vulnerability and its evidence, so that it can be fully assessed and mitigated with an appropriate fix.

enothereska 20 days ago

I'm a co-founder at trent.ai; happy to answer any questions about this.
