by muldvarp 59 days ago
Manual verification that the "judge" judges correctly.

Also, how exactly do you programmatically validate CVEs?

1 comment

Most open-source CVEs link a patch in their disclosure. The lines that patch removes or changes are the vulnerable code, so you can pull them out of the git diff and then check whether the LLM's finding points at them.
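
A rough sketch of that check in Python, in case it helps. Everything named here is my own assumption rather than part of any existing tool: the `Finding` shape (file path plus line number), the `vulnerable_lines` and `finding_matches` helpers, the 3-line `slack`, and the sample diff. It assumes the patch is a plain unified diff (what `git diff` or `git show` prints) and that the LLM reports line numbers against the pre-patch file.

```python
import re
from dataclasses import dataclass

# Unified-diff hunk header, e.g. "@@ -10,4 +10,4 @@ int parse(char *s)"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one LLM finding: a file and a pre-patch line."""
    path: str
    line: int


def vulnerable_lines(diff_text):
    """Map each pre-patch file path to the line numbers the patch removed."""
    removed = {}
    path = None
    old_line = 0               # current line number in the pre-patch file
    old_left = new_left = 0    # lines still expected in the current hunk
    for raw in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: classify the line by its first character.
            if raw.startswith("-"):
                removed.setdefault(path, set()).add(old_line)
                old_line += 1
                old_left -= 1
            elif raw.startswith("+"):
                new_left -= 1
            elif raw.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line, present in both versions.
                old_line += 1
                old_left -= 1
                new_left -= 1
            continue
        if raw.startswith("--- "):
            name = raw[4:].split("\t")[0]
            if name == "/dev/null":
                path = None  # file did not exist before the patch
            else:
                path = name[2:] if name.startswith("a/") else name
        else:
            m = HUNK_RE.match(raw)
            if m:
                old_line = int(m.group(1))
                old_left = int(m.group(2) or 1)  # count defaults to 1
                new_left = int(m.group(4) or 1)
    return removed


def finding_matches(finding, removed, slack=3):
    """True if the finding lands within `slack` lines of a removed line."""
    lines = removed.get(finding.path, ())
    return any(abs(finding.line - n) <= slack for n in lines)


# Made-up patch for illustration only.
SAMPLE_DIFF = "\n".join([
    "diff --git a/src/parse.c b/src/parse.c",
    "--- a/src/parse.c",
    "+++ b/src/parse.c",
    "@@ -10,4 +10,4 @@ int parse(char *s)",
    "     char buf[16];",
    "-    strcpy(buf, s);",
    "+    strncpy(buf, s, sizeof buf - 1);",
    "     buf[15] = 0;",
    "     return 0;",
])

removed = vulnerable_lines(SAMPLE_DIFF)
print(finding_matches(Finding("src/parse.c", 12), removed))
```

One caveat: a fix that only adds lines (say, a missing bounds check) removes nothing, so this would report no vulnerable lines for it. For those patches you would probably have to match against the hunk's surrounding context instead.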