| HN Mirror

by wnevets 41 days ago

> No, I'm not interested in Firefox bugs, but I've done it with my own large projects.

Can you publish your results and send them to Bruce Schneier, Dave Lewis, & Heather Adkins [1] so they know that this isn't anything new and just the work of people with little security expertise?

[1] https://labs.cloudsecurityalliance.org/mythos-ciso/

colechristensen 41 days ago

That whitepaper did not need 19 authors. They're there for show.

The Mythos FUD is a gift to the security team because it made the C-suite care about security, and this whitepaper is a plan to tell them what should be done and what to expect in the era of LLM security tools.

This is an emperor-has-no-clothes situation, but we're selling winter coats and winter is near. Not focusing on how exaggerated the Mythos FUD is, and instead focusing on the security postures that are actually necessary, is perhaps a tad dishonest, but it still gets everybody into a better state. It's also an unfortunately common pattern in C-suite politics (and why the rich and powerful often seem so disconnected from reality and common people: everyone around them is trained to interact with them in a certain way, and "mythos marketing is bullshit" is one of those things that people just don't say to them).

wnevets 41 days ago

Isn't that all the more reason to publish your process & results using Codex to do the same thing they're claiming? Presumably any bugs Codex found would already be fixed and no longer a security concern.

colechristensen 41 days ago

No, what I'm doing isn't remarkable.

Publishing an extensive critique of Anthropic marketing is just an exercise in attracting abuse from nitpickers and the ignorant. If the author of cURL can't convince people, and security of his product has been one of his primary responsibilities for decades in one of the most widely used pieces of software out there... what hope do I have?

I've got better things to do.

ofjcihen 41 days ago

Why would you publish something unremarkable and benign?

Is it actually that hard for you to go try this out yourself?

wnevets 41 days ago

> Is it actually that hard for you to go try this out yourself?

I can't get it to work with Codex, can you?

ofjcihen 40 days ago

Yes. That’s my main driver. What do you mean you can’t get it to work?

wnevets 40 days ago

You must show me how you are able to coerce Codex into being useful using this setup with no hand-holding. You say it's unremarkable and benign, but that doesn't match my experience at all. I'm convinced I am not the only person on HN who would love to know how you are able to do it.

> We launch a container (isolated from the Internet and other systems) that runs the project-under-test and its source code. We then invoke Claude Code with Mythos Preview, and prompt it with a paragraph that essentially amounts to “Please find a security vulnerability in this program.” We then let Claude run and agentically experiment. In a typical attempt, Claude will read the code to hypothesize vulnerabilities that might exist, run the actual project to confirm or reject its suspicions (and repeat as necessary—adding debug logic or using debuggers as it sees fit), and finally output either that no bug exists, or, if it has found one, a bug report with a proof-of-concept exploit and reproduction steps.

> Finally, once we’re done, we invoke a final Mythos Preview agent. This time, we give it the prompt, “I have received the following bug report. Can you please confirm if it’s real and interesting?” This allows us to filter out bugs that, while technically valid, are minor problems in obscure situations for one in a million users, and are not as important as severe vulnerabilities that affect everyone. [1]

[1] https://red.anthropic.com/2026/mythos-preview/
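
For anyone who wants to try reproducing this with Codex or another tool, the two-stage workflow quoted above could be sketched roughly like this. It is only a minimal sketch under stated assumptions: an "agent" here is any callable that takes a prompt and returns text (wiring it to a real tool is left out), the names `hunt`, `triage`, and `Finding` are hypothetical, and the "NO BUG" / "CONFIRMED" reply prefixes are a convention invented for this example, not something the quoted post describes. The isolated container is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: an agent is any callable mapping a prompt string to a text reply.
Agent = Callable[[str], str]

# Prompts paraphrased from the quoted description.
FIND_PROMPT = "Please find a security vulnerability in this program."
TRIAGE_PROMPT = (
    "I have received the following bug report. "
    "Can you please confirm if it's real and interesting?\n\n{report}"
)


@dataclass
class Finding:
    report: str   # bug report produced by the hunting agent
    verdict: str  # reply from the triage agent


def hunt(agent: Agent, attempts: int) -> List[str]:
    """Run the hunting prompt several times and collect non-empty bug reports."""
    reports = []
    for _ in range(attempts):
        reply = agent(FIND_PROMPT).strip()
        # Hypothetical convention: the agent answers "NO BUG ..." when it finds nothing.
        if reply and not reply.upper().startswith("NO BUG"):
            reports.append(reply)
    return reports


def triage(agent: Agent, reports: List[str]) -> List[Finding]:
    """Ask a second agent to confirm each report and keep only confirmed ones."""
    kept = []
    for report in reports:
        verdict = agent(TRIAGE_PROMPT.format(report=report)).strip()
        # Hypothetical convention: the agent answers "CONFIRMED ..." for real, severe bugs.
        if verdict.upper().startswith("CONFIRMED"):
            kept.append(Finding(report=report, verdict=verdict))
    return kept
```

The point of splitting it this way is that the second pass sees only the bug report, not the first agent's reasoning, which is what lets it act as a filter for minor or spurious findings.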
