by atleastoptimal
70 days ago
Their main argument that the model is too dangerous rests on what they discovered about its ability to find exploits in commonly used software. The idea is that if it were served on a public API, it would massively increase the scale and scope of what malicious actors could do. I think it's a reasonable choice, given that Mythos actually does have cyber capabilities on that level. We already have evidence that large-scale scams are being perpetrated using AI models (such as AI video being passed off as real, or people deepfaking themselves in job interviews). If you've noticed that your new model can be trivially pointed at some open-source codebase with a prompt and harness that amounts to "find as many exploits as possible", and the results are substantial and beyond what existing models can do given the same initial parameters, then a gated rollout seems the most reasonable option.
However, this claim is not true.
Anthropic has not given many details about the methods used, but they have nonetheless admitted to using a very elaborate bug-finding harness, which runs Mythos many times on each file of a project, with increasingly specific prompts.
Eventually, after a bug seems to be clearly identified, they do a final run of Mythos on that file, with a very specific prompt of the form:
“I have received the following bug report. Can you please confirm if it’s real and interesting? ...”
So the final results, including any exploits or patches, are produced when analyzing a known bug, not by searching randomly for bugs.
Thus the actual way Mythos is used is very far from "find as many exploits as possible". An unskilled person would need the complete bug-searching harness used by Anthropic as well, not just the bare model.
See: https://red.anthropic.com/2026/mythos-preview/