by apexalpha 33 days ago
> An amazingly successful marketing stunt for sure.

This. Well done by Anthropic.

It even reached the CISO of my small semi-government org in the Netherlands, who slightly panicked at the announced 'tsunami' of vulnerabilities that was coming with Mythos.

Got us some more money and priority with the board, though.

Never waste a good marketing scare.

5 comments

I don't agree with the "no tsunami in sight": if you don't look at 100+ bugs in Firefox and many more OSS projects, a bunch of old unseen-before OpenBSD/Linux RCEs, and a few LPEs in just 2 or 3 weeks for Linux itself...

IMO, this does not sound like a marketing scare; there is a spike of vulnerability disclosures - high quality, low false positives - that can be sensed... It feels like we're speedrunning through a few years' worth of high-quality bug reports in just a few weeks.

Mythos isn’t released yet.

Anthropic noticed the trend of AI vulnerability scanning and started advertising Mythos, which is unreleased, as being very good at it.

Then they donated very large token budgets for using Mythos privately to several teams. Those teams used the free token spend for security research (that was the deal) and anything they found got attributed to Mythos, not the token budget.

Mythos looks like a good incremental model, but the PR team has done a great job of associating themselves with the current trend. So much so that comments like yours already associate the vulnerabilities found with this model, which isn’t even available yet.

Mythos hasn't been released yet, but there seems to be some evidence that GPT-5.5, which has been released, is already a touch better anyhow in some dimensions: https://www.mindstudio.ai/blog/gpt-5-5-vs-claude-mythos-cybe...

Close enough that you can probably get a good sense of Mythos' performance by using GPT-5.5.

One thing I noticed while using GPT-5.5 for this is that the ability of the model to turn the bug into an outright vulnerability is less relevant than you might intuitively think. All that is really necessary is for the model to point out that something is smelly, and you should just fix it. Turning it into a runnable exploit has very limited utility for the defender. It does turn heads and may get the attention of some otherwise reluctant people, but everything I found was obviously enough wrong that the exploit was just decorative.

An actual PoC is often very helpful in prioritizing getting the bug fixed, in demonstrating that the bug is real, and in providing something that devs can see happening in their debuggers.
The LPEs were not found with Mythos but with existing, publicly available models.
And also: they did an earlier run with Opus to discover bugs (like segfaults).

In February, Opus discovered a whole bunch of security related bugs, but didn’t exploit them.

Mythos, in turn, was fed these bugs and told to exploit them.

Not saying it’s not impressive, but it was literally told “here are all the places our metal detector says there may be gold, please find gold”.

There is a significant difference between being able to see one flaw and being able to chain together multiple disparate flaws, to be fair.
> bunch of old unseen-before OpenBSD/Linux RCEs,

AFAIK, the only thing it found in OpenBSD was a DoS?

Edit: For that matter, I'm not aware of RCEs in Linux, only LPE?

The whole thing started with a talk from Nicholas Carlini mentioning a remote 20+ year old NFS vuln IIRC.
Anthropic is quickly destroying customer goodwill by repeatedly pulling the same stunt. Horrible marketing, imho.

It's an entirely different thing to have the company conduct research on LLMs in general being a cybersecurity threat, instead of going "our new model is just too powerful" and shifting the discussion to revolve around that. It's slimy.

Hasn't almost every new frontier model had an early period of limited access? I don't get why everyone is acting like Mythos is particularly egregious for this.
This is literally how they announced the model:

> We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity.

> Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.

https://www.anthropic.com/glasswing

It is called "Mythos", dude... do you have any idea how mysterious and scary this sounds to most people, and how much hype that alone can generate?

If the model was called "Mini Mouse" it wouldn't feel anywhere near as threatening and interesting.

It sounds like the name of a cologne from the 70s or something and I like it.

The bar has become so low lately that no one will care.
He describes in detail how curl is software-engineered to within an inch of its life. Do you really think most code is that highly polished?
Well done for convincing people of something that isn't true, in other words, well done for lying? Is this what is being cheered on? Seriously.
org head is smart.