HN Mirror


by andai 3 hours ago

> Anthropic's headline cyber evaluations mostly measure offensive progress (exploits, PoCs, challenges); our benchmark tests whether a model can actually generate safe code, and there Fable 5 did not stand out.

The model isn't allowed to think about security. I heard several people here mention that if it starts thinking about security -- e.g. writing tests related to it -- the safety filter flags it and downgrades to Opus.

So it's actually not allowed to make your code secure.

2 comments

matheusmoreira 1 hour ago

Yeah. Fable apparently found bugs in my C code but Anthropic wouldn't allow it to test them, fix them or even tell me what the problem was. The memory safety parts of my Fable code review were 50% Opus. Even the coordinator Fable that just launched the code review agents got downgraded to Opus for some reason.

Model is definitely better than Opus but Anthropic's delivering a pretty terrible experience.


latentsea 2 hours ago

> So it's actually not allowed to make your code secure.

Anything designed to prevent a problem will eventually cause one.
