What is happening? I see multiple outages and CVEs being reported on HN's front page. I've never seen this many security/incident-related posts on HN's front page.
Some combination of reporting bias, given concerns about LLM security capabilities, and actual new vulnerabilities found with LLM assistance. Even if the exploits and outages are unrelated to LLMs, I'm certainly thinking about whether Claude could build these things (or whether actors already have).
Slowly at first, and then suddenly. AI-assisted anything follows this trend. As capabilities improve, new avenues become "good enough" to automate. Today it's security.
I believe a good portion of the CVEs hitting the front page are there mostly because they are AI-related (found partially or wholly by AI) and make for quick upvotes.
I would caution against thinking it's difficult for an LLM. I've used them for raw data file analysis and they are frequently shockingly good at pulling structure and meaning out of seemingly random data. Disassembled binaries are already structured, so pulling code flow out of them is easier. Mix that with existing disassembly and inspection tooling, and an LLM has what it needs to fast-track this kind of vulnerability research. Point being, an LLM with the proper tools can potentially follow code flow through disassembled binaries far more easily than a human.
I forget who it was, but someone on YouTube said LLMs already work when hooked up to Ghidra. If true, it's only a matter of time until they find similar things in e.g. Windows. I'll wait half a year to a year (to allow for embargoes), and if there still isn't such work for Windows I'll conclude that LLMs have a problem with disassembled binaries.
Anyone care to share which models and which prompts actually lead to finding these kinds of vulnerabilities? Or the narrowing-down workflow that can get an LLM to discover them? Surely just telling Claude "Find all vulnerabilities in this project LOL" isn't enough? I hope?
Everyone was talking about how Mythos was overblown marketing, and while it may be, they missed the forest for the trees. Capabilities have been escalating for a year now and we're at the point of widespread impact. I don't suspect we'll see a slowdown for a long time.
I agree. It is not as if Mythos or other LLMs are insanely smart or superhuman. Many of these vulnerabilities could be discovered fairly easily by trained human experts as well. The problem is more that shaking out these issues requires an insane amount of attention and time from highly paid experts, whereas an LLM never gets tired and can analyze a large amount of code at low cost.
Linus' law was wrong because there were never enough (qualified) eyeballs to check the code. LLMs provide an ample supply of eyeballs (though it's not a benefit to open source, since proprietary developers can use the same LLMs).
Same applies to them being good enough to program, but many are so focused on source code generation that they don't get the whole picture.
Thanks to agents and tool calling, there are now business cases that can be fully described by AI tooling: the next step after microservices, serverless and whatnot.
Naturally with a much smaller team than what was required previously.
AI assistance was explicitly disclosed for yesterday's. Today's at least has Claude as one of two contributors on this GitHub Pages site, so it's very likely there too.
Agents are capable of finding this kind of stuff now and people are having a field day using them to find high-profile CVEs for fun or profit.
Yes, I think people forget that the cyber-war between West and East is very active, with a significant number of attacks being carried out by nation states or state-sponsored groups.