How RL Reward Hacking Made Claude Mythos a Zero-Day Hunter (uberdavid.substack.com)
2 points by uberdavid 75 days ago
1 comments

Mythos's research is easily replicated with other top models and even some open-source models, so this is nothing new.

This is just like using fuzzers to test programs, or auditing code with test cases to find potential issues and then turning those into an exploit.

The system card directly compares Mythos to Opus 4.6 and other frontier models on the same evals. Cybench went from ~75% to 100%, and Firefox exploitation went from 1 bug unreliably to 4 bugs reliably. It's true there are many capable coding models out there, but the post is about why this specific jump in cyber capability happened.