Letting Claude get at the source code to try to find CVEs. I found it particularly entertaining that after finding none it just devolved to a grep for "strcat."
Oh, I see. No, you're wrong. That's absolutely not what it did and not at all an accurate way to sum up what it found.
This isn't a complete rebuttal to your argument but I'll note with irony that we're commenting on a thread about a FreeBSD kernel remote that Claude both found and wrote a reliable exploit for (though people will come out of the woodwork to say that reliable exploitation of FreeBSD kernel remotes isn't much of a flex).
Here, from the exact tranche of vulnerabilities you're saying was just a "grep for strcat", are the Firefox findings:
We're getting to a point, like we did with coding agents last year, where you can just say "I believe my lying eyes". Check out a repository and do Carlini's "foreach FILE in $(sourcefiles); <run claude -p and just ask for zero days starting from that file>". I did last night, and my current dilemma is how obligated I am to report findings.
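For anyone who wants to try that loop, here is a rough sketch of what I mean, not Carlini's actual script. The suffix list, the prompt wording, and the helper names (source_files, build_command, run_all) are all made up for illustration; the only thing taken from the comment above is the shape: one "claude -p" invocation per source file.

```python
import subprocess
from pathlib import Path

# Assumption: which extensions count as "source files" for the target repo.
SOURCE_SUFFIXES = {".c", ".h", ".cc", ".cpp"}


def source_files(repo_root):
    """Return every file under repo_root with a matching suffix, sorted."""
    root = Path(repo_root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


def build_command(path):
    """Build one non-interactive invocation per starting file.

    The prompt text is a hypothetical placeholder, not a known-good prompt.
    """
    prompt = f"Look for zero-day vulnerabilities in this codebase, starting from {path}."
    return ["claude", "-p", prompt]


def run_all(repo_root, dry_run=True):
    """Loop over the source files; with dry_run=True nothing is executed."""
    results = {}
    for path in source_files(repo_root):
        cmd = build_command(path)
        if dry_run:
            # Record the command that would run instead of running it.
            results[str(path)] = cmd
            continue
        proc = subprocess.run(cmd, cwd=repo_root, capture_output=True, text=True)
        results[str(path)] = proc.stdout
    return results
```

Run it with dry_run=True first to see how many invocations you are signing up for, then flip it to False from inside the checkout. Whatever comes back still has to be triaged by hand.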
It's from the link I posted. Claude's own team in January trying to do exactly what you suggested and ending with results that are less than promising. It's their blog. I assumed it represented the pinnacle of their research.
We're getting to a point where anecdotes are being used in place of reason. I'd think you'd want to ask "how many bug bounties are earned by humans vs AI assistants?" If there's money to be made in finding 0-days then shouldn't there be ample evidence of this?
No. I can't. That's the point. You've not disclosed what you've done. The link you provided contains locked disclosures I can't access, but which all appear to be submitted by humans. And the article itself has a giant problem: it didn't discover anything, it merely crafted a PoC from an existing CVE.
Which is why I'm confused. A limited number of particular people say there's this giant sea change. I cannot find any hard evidence that's true.
If the Anthropic blog was trying to _sell me_ on their service, it failed miserably. So I guess I can at least safely assume they have no idea how to market their own product.
The Firefox team has acknowledged the vulnerabilities, which are obviously not "greps for strcat" as you claimed. I mean, you've been refuted; I don't really understand what the argument is supposed to be at this point.