Show HN: We're inviting Anthropic to put the real Mythos 5 on our open benchmark (realvuln.com)
4 points by jfaganel99 5 days ago
4 comments

Just a question on the benchmark. It states that it runs on real-world code, but all the repos in the dataset are intentionally vulnerable repos, right? Not real-world codebases that have reported vulnerabilities?
Asking because I can't answer it myself...

Created an open-source benchmark for code security scanners and ran a bunch of them, along with LLMs, on real vulnerable code. Fable 5 is also on there as of yesterday, but that's the gated public model. The one we all want to see is Mythos 5, and it's locked to a handful of vetted orgs.

So does anyone here have access to Mythos 5 and can run it against the benchmark?

Would genuinely like to see what it scores and at what cost.

I tried to get access to Mythos for our internal security but absolutely no luck!

Again, this seems like it was all the start of their scaremongering campaign to get some legislative/institutional advantage over the Chinese models.

For the sceptics... The benchmark is research-based, with a published arXiv paper on the methodology:

https://arxiv.org/abs/2604.13764