by jerf 42 days ago
Mythos hasn't been released yet, but there seems to be some evidence that GPT-5.5, which has been released, is already a touch better anyhow in some dimensions: https://www.mindstudio.ai/blog/gpt-5-5-vs-claude-mythos-cybe...

Close enough that you can probably get a good sense of Mythos' performance by using GPT-5.5.

One thing I noticed while using GPT-5.5 for this is that the model's ability to turn a bug into an outright vulnerability is less relevant than you might intuitively think. All that is really necessary is for the model to point out that something is smelly, at which point you should just fix it. Turning it into a runnable exploit has very limited utility for the defender. It does turn heads and may get the attention of some otherwise reluctant people, but everything I found was wrong obviously enough that the exploit was just decorative.

1 comment

An actual PoC is often very helpful in prioritizing getting the bug fixed, in demonstrating that the bug is real, and in providing something that devs can see happening in their debuggers.