The world is so not ready for the impact of LLMs on security issues. If true, congrats to the Calif team. It’s likely too technical for me to understand in detail, but I'm looking forward to reading the 55-page report.
> The world is so not ready for the impact of LLMs on security issues.
I agree, but it's the people I'm worried about.
I'm hearing anecdotes from all over about devs pushing LLM-generated code changes into production without retaining any knowledge of what it is they're pushing. The changes compound, their understanding of the codebase diminishes, and so the actions become riskier.
What's worse is a lot of this behavior is being driven by leaders, whether directly (e.g. unrealistic velocity goals, promoting people based on hand-wavy "use AI" initiatives, etc.) or indirectly (e.g. layoffs overloading remaining devs, putting inexperienced devs in senior roles, etc.).
The world's gone mad and large swaths of the industry seem hellbent on rediscovering the security basics the hard way.
The gamble is that you can cruise on the senior engineer’s diminishing understanding for a few years until models become good enough that you don’t need any humans in the loop and you can fire all those expensive seniors.
The tragedy is having a bunch of those senior engineers writing blog posts and whatnot about how productive they are, without realising that it means business now needs fewer of them.
I suppose that if you don’t believe that models will be good enough to work completely without senior engineer help, positioning yourself as a master prompter is a good move to improve your chances of not getting fired.
If all you have is being good at prompting, you are gone. Business is going to prefer new grads who have taken some class in AI prompting (which many schools now offer). Doesn't matter if the class in AI prompting isn't any good. What matters is the idea that they had formal training in this thing, and that they are willing to work for far less pay than any senior.
> I'm hearing anecdotes from all over about devs pushing LLM-generated code changes into production without retaining any knowledge of what it is they're pushing. The changes compound, their understanding of the codebase diminishes, and so the actions become riskier.
The difference is twofold. First, junior devs who ask for code reviews on massive, 2000+ line diffs get coached, and eventually fired if they persist at it. And second, even the most prolific junior engineer would take years to write what Claude is capable of generating in an afternoon.
When Sundar Pichai announces that 75% of all new code at Google is AI-generated, their stock price goes up. If he were to announce that 75% of all new code at Google is now written by junior engineers, this would trigger a massive sell-off and a lot of employees would resign.
The dangers of technical debt and the importance of mitigating it have been known for a long time. Unfortunately a lot of entities now ignore all experience and best practices as soon as you say the "AI" buzzword.
> I'm hearing anecdotes from all over about devs pushing LLM-generated code changes into production without retaining any knowledge of what it is they're pushing. The changes compound, their understanding of the codebase diminishes, and so the actions become riskier.
I don’t think so.
An LLM can produce higher-quality documentation than most humans. If it's not already happening, when a new developer joins a team, they're going to have an LLM produce any documentation a new developer needs, including why certain decisions were made.
It could also summarize years of email threads and code reviews that, let's face it, a new person wouldn’t be able to ingest anyway; it's not like a new developer gets to take a week off to get caught up on everything that happened before they got there. English not their first language? Well, the LLM can present the information in virtually any language required.
As the models continue to improve, they'll spot patterns in the code that a human wouldn’t be able to see.
> An LLM can produce higher-quality documentation than most humans.
"Can" bears some heavy weight there.
LLM-generated documentation has such low information density that it’s useless. Yes, it writes nice sentences… or even writes. But it contains so much noise that currently, reading code is better documentation than what I’ve seen from every single piece of LLM-generated documentation.
The same with LLM generated articles. I close them after the second sentence because at least about 90% of it is useless filler.
I almost closed it when I read the first few sentences because these kinds of articles are useless, time-wasting nonsense. But this was different. This was old. Most sentences contained something new. Something worthy. (Of course, people also write unnecessarily long articles… looking at you, Atlantic.)
You can throw out almost everything by volume from LLM-generated documentation without losing any information.
Currently, if I smell (and it’s very easy to smell) LLM generated documentation or article, then I close it immediately, because it’s good for only one thing: wasting my time, for no good reason.
> LLM-generated documentation has such low information density that it’s useless. Yes, it writes nice sentences… or even writes. But it contains so much noise that currently, reading code is better documentation than what I’ve seen from every single piece of LLM-generated documentation.
I should clarify: the documentation I’m talking about is not generated using a generic LLM prompt, which would mostly suck.
With the proper context and additions (skills, plugins, MCPs) LLMs can produce high-quality documentation. You'd also have subagents doing QA of the documentation.
If stuff really goes wrong, you need people who deeply understand the codebase so that they know where to look and how to diagnose the issue. It might be the case in the future that LLMs become so powerful they'll diagnose any issue (I doubt it), but until then, we need people in the loop.
While maybe true, it is better to back that up with data, and the data I know of and read yearly is mostly not great. Between the Splunk and SANS surveys of 2025, maybe ~2000 companies have a SOC. [1] [2]
Then you have the many companies in the UK, US, Canada, EU that have compliance and regulatory laws that require them to exist in some capacity in house. Though that is changing with MDR services, but someone still has to interface with the MDR.
That is actually unfair. Most companies spend enormous amounts on security with vast armies of security employees. Not that it is effective, but it is not for lack of resources or trying.
I mean we are literally in a thread about how the 4 trillion dollar company, literally the 3rd most valuable company in the world, with a core competency in software has, yet again, released a core product riddled with security defects for the 50th year in a row.
Commercial IT security is an industry that is incapable to a fault and has, so far, faced basically zero consequences for it.
For every Apple, there are 100 mom-and-pop companies who have nothing.
Even more so in the future when a software company can be launched by a farm of AI agents with a founder at the helm who has no clue about computing or security.
What's debatable is how many of those companies actually need airtight security, because they are never realistically going to be targets of criminals and/or they have nothing valuable to steal/corrupt in the first place (other than the owner's pride).
I was pointing out how even Apple, an entity that by all rights should have top-notch security, is still absolutely hopeless in the face of commonplace commercial, profit-motivated attackers.
Massive, extremely well-resourced divisions supported by management in a technically competent organization that is actually trying to solve the problem struggle to produce at best middling security that is inadequate against commonplace threats. This is not a prioritization problem; even if you do “everything right” you are still vulnerable to run-of-the-mill commercial attackers. This is a fundamental capability problem, like how we cannot make a net-positive fusion reactor right now.
It is actually unfair to blame these companies for not having a fusion reactor because they “were not trying hard enough”. Actual security is not an easy problem, and it is a great disservice to portray it as one that is only unsolved due to dunderheads being in charge, since it leads to underestimating what actually needs to be done.
None of this denies that you can do dramatically worse than the “gold standard”, or that most organizations are actually incompetent; but the “gold standard” is still objectively grossly inadequate. You need to be dramatically better than the 4 trillion dollar software company to be adequate against prevailing threats.
They have a website that can be used to host malware and/or SEO link farms.
I still have nightmares about the contact form on my low-stakes personal website getting hijacked to use as a spam sender (because I used unsanitized input in mail headers).
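For anyone who hasn't run into this class of bug: below is a minimal sketch of how unsanitized input in mail headers turns a contact form into a spam relay, and one simple way to block it. The function names and the example addresses are made up for illustration, not taken from any real site or library. The only idea it relies on is that mail headers are separated by line breaks, so a CR or LF inside a visitor-supplied value lets them start a new header of their own.

```python
def build_headers_unsafe(sender: str, subject: str) -> str:
    # Vulnerable: visitor input is pasted straight into the header block.
    return f"From: {sender}\r\nSubject: {subject}\r\n"


def reject_header_injection(value: str) -> str:
    # A CR or LF in a header value would let the visitor begin a new header line.
    if "\r" in value or "\n" in value:
        raise ValueError("line break in header value")
    return value


def build_headers_safe(sender: str, subject: str) -> str:
    # Same layout as above, but every visitor-supplied value is checked first.
    return (
        f"From: {reject_header_injection(sender)}\r\n"
        f"Subject: {reject_header_injection(subject)}\r\n"
    )


# Hypothetical malicious "subject" typed into the form: it smuggles in a Bcc header.
malicious = "hi\r\nBcc: victim1@example.com, victim2@example.com"

if __name__ == "__main__":
    print(build_headers_unsafe("visitor@example.com", malicious))
    try:
        build_headers_safe("visitor@example.com", malicious)
    except ValueError as err:
        print("rejected:", err)
```

Rejecting the value outright, rather than trying to strip or escape the line breaks, is a deliberately blunt choice here: a legitimate subject line or email address has no reason to contain one.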
Hey now, when Apple products get a serious kernel-level vulnerability that can be triggered just by browsing a website, it's a "jailbreak", not an "exploit".
Not at all. I’m considering that the amount of vulnerable software in the wild is very, very large, with most organizations not managing their systems properly. Imagine all the small to medium size companies that do not have budgets for a dedicated, talented security team. And all the software that will never be patched. We are at the beginning of the exponential.
> I’m considering that the amount of vulnerable software in the wild is very, very large
I'd imagine this set is very similar to just "the set of software in the world". Even before the AI stuff, it was a pretty good bet that any given piece of software had some vulnerability; it was just a question of how easy it was to find it.
Yes, that’s my point. Look at how fast the Calif team tackled that macOS issue. Against the top company in the world. One week from bug to exploit. In 2-5 years things will be really wild for everybody out there. We released a technology that makes it possible to design extremely complex exploits at a scale we never had to face before. What does that mean if you’re not the top company? Things will be really bad.
It makes you wonder whether everything will need to be rewritten from the ground up, potentially by AI itself, or with AI having a very heavy hand in validating all of it.
There's so much lower-hanging fruit. Every job I've had has had basically everything massively out of date. Just keeping packages and framework versions up to date is a full-time job, and none of these companies have someone assigned to doing it.
So much out of date software with known exploits left running for years. The only reason there hasn't been total disaster is no one has tried to hack it yet.
Yes, exactly, that’s the main change. And not just in a script kiddie way. What we see now is that LLMs + experts can develop extremely complex exploit chains in no time. It’s one thing to exploit a known vulnerability that you can patch by upgrading your WordPress; it’s something else when the attacker is able to completely take over your systems in ways you didn’t even consider possible and adapt in 1 day to your attempts at patching.