Hacker News
by traceroute66 30 days ago
> I thought Mythos was just a bunch of hype?

My opinion is that it is over-hyped because, like any LLM, it requires a suitable human in the loop to keep the LLM on the straight and narrow, and then to weed through the inevitable false positives and hallucinations.

Nicholas Carlini, for example, whose name is on many of the recent high-profile Mythos findings, is not just some random dude with a Claude sub on his credit card ... he's an experienced security researcher.

Random inexperienced people thinking Mythos can replace the need for experienced pen-testers, auditors etc. are likely to be sorely disappointed if/when they get their hands on Mythos.


> Nicholas Carlini, for example, whose name is on many of the recent high-profile Mythos findings, is not just some random dude with a Claude sub on his credit card ... he's an experienced security researcher.

I don’t think Mythos is hype for all kinds of reasons.

Anthropic is a young company but their track record is solid; they don’t seem to hype things just for the sake of hyping things. Sam Altman at OpenAI? We already know his track record…

I’m going Occam’s razor here: the simplest explanation is usually the correct one.

Anthropic had an “oh shit” moment when they realized what Mythos can do. They decided to do the responsible thing: give the industry a heads-up and an opportunity to use the preview to identify and fix the most dangerous zero-day vulnerabilities.

Since the FAANG companies have billions of users, it makes sense to start with them.

There’s still going to be major issues for users of systems too old to get patches or updates. Or for IT organizations who think Mythos is a replay of Y2K, where, compared to the warnings, not a lot happened.

The bottom line is someone with Mythos won’t need to be an experienced security expert to cause real problems. That’s kind of the point.

> replay of Y2K, where, compared to the warnings, not a lot happened

My dad was on one of the many Y2K teams that major tech companies set up to make sure nothing went wrong. I feel like history may have undersold what could've happened if not for the considerable effort leading up to Jan 1, 2000.

I think it's worth looking at the recent XBOW benchmark: https://xbow.com/blog/mythos-offensive-security-xbow-evaluat... They found that ChatGPT 5.5 works better, so the secret is in the architecture (including humans in the loop).
'frontier tokens are not fungible'
> it is over-hyped because, like any LLM, it requires a suitable human in the loop to keep the LLM on the straight and narrow, and then to weed through the inevitable false positives and hallucinations.

"Suitable human" is a dry phrase indeed. ^_^

The hype is "gosh look at all the bad things this brilliant almost conscious tool found!"

The reality: an insecure toolchain for an insecure language with an insecure compiler produced a runnable but insecure binary for an insecure OS. We couldn't be arsed to address any of this before, but now we're being billed the full price of our laziness.

Yeah, I was thinking earlier, the way things are going, software (and maybe the internet itself) might need to look a little different in a few years.

Ironically the AIs will probably help us produce higher quality software in the end, because "everything gets pwned" becomes the forcing function for software actually being correct.

In other words I think we are actually entering an age where correctness makes economic sense. (One can dream!) The cost of producing correctness is dropping, and the cost of not doing so is rising massively.

Over time that will change. Technology has proven time and time again that as we add a new layer of abstraction over the fundamental functionality, knowledge of the previous layer quickly becomes vestigial. That is true not just in software but in all technology, going back to the first fire, the atlatl, or the rock sling.
> likely to be sorely disappointed if/when they get their hands on Mythos.

At first they will be delighted. So much money and time saved. When their adversaries get their hands on their system (with or without Mythos), then they'll be sorely disappointed.