Hacker News
by stratos123 54 days ago
In terms of quantity, definitely yes (a single person managing a swarm of Opusi can already find far more real bugs than a security researcher, hence the rise in reports).

In terms of quality ("are there bugs that professional humans can't see at any budget but LLMs can?") - it's not very clear, because Opus is still worse than a human specialist, but Mythos might be comparable. We'll just have to wait and see what results Project Glasswing gets.

Either way, cybersecurity is going to get real weird real soon, because even slightly-dumb models can have a large effect if they are cheap and fast enough.

EDIT: Mozilla's answer to the second question is "no", by the way. Talking about the 271 vulnerabilities recently found by Mythos, they wrote: "Encouragingly, we also haven’t seen any bugs that couldn’t have been found by an elite human researcher." https://blog.mozilla.org/en/firefox/ai-security-zero-day-vul...


> Opusi

The plural of "Opus" is "Opera". Might be a tad confusing tho :)

Opuses is also correct English, and clearer in non-academic contexts.

Opera is the traditional plural from Latin, now perhaps reserved for more scholarly use in English.

Results from a quick search.

I'll do the faux German thing then: Opusen :)
pseudofauxteutönic
Wondered for a second "what does that browser have to do with all this?"
There is also a huge surface area of security problems that can't happen in practice due to how other parts of the code work. A classic example is unsanitized input being used somewhere where untrusted users can't inject any input.

Being flooded with these kinds of reports can make the real problems harder to see.
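
To make the "unsanitized input that no untrusted user can reach" case concrete, here's a minimal sketch in Python. All names (`count_rows`, `report`, `ALLOWED_TABLES`, the tables) are made up for illustration and not taken from any real codebase or report.

```python
import sqlite3

# Hypothetical names throughout: ALLOWED_TABLES, count_rows, report, and the
# schema are invented for this illustration.
ALLOWED_TABLES = ("users", "orders")  # hardcoded constants, never user input


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    # Looked at in isolation, this resembles SQL injection: `table` is
    # interpolated into the query text with no sanitization.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def report(conn: sqlite3.Connection) -> dict:
    # The only call site: `table` always comes from ALLOWED_TABLES, so no
    # untrusted user has a way to influence the query string.
    return {table: count_rows(conn, table) for table in ALLOWED_TABLES}


def demo() -> dict:
    # Build a throwaway in-memory database and run the report against it.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER)")
    conn.execute("CREATE TABLE orders (id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
    conn.execute("INSERT INTO orders VALUES (1)")
    return report(conn)
```

A reviewer (human or model) who reads only `count_rows` sees an injection pattern; one who follows the call graph sees there is no attacker-controlled path to it. It still might be worth hardening in case a new caller shows up later, but as written it isn't exploitable.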

They wouldn't be classed as vulnerabilities then, since, you know, there is no vulnerability. Unless you have evidence that most of these issues are unexploitable, but I would be surprised to hear that they were considered vulnerabilities in that case.
I believe the LLM would flag this kind of thing as a potential issue.