| HN Mirror



	by espeed 50 days ago
	Run /model after your task to see. Mine keeps downgrading to Opus 4.8, which is a problem because Opus 4.8 keeps no-oping critical security code.

2 comments

tekacs 50 days ago

What you're describing only applies to security or biotech downgrades. A downgrade related to the model believing that you're doing something related to model development is invisible and silent and internal.


steveklabnik 50 days ago

Anthropic has reversed that decision. (But that just happened so it might have been true during the article's testing.)


espeed 50 days ago

When I reported this, Anthropic sent me an email on Tuesday saying, "You have been approved into the Cyber Verification Program", but it's still downgrading. Is this a bug? What's the point of the Cyber Verification Program if Fable 5 downgrades when you tell it to write secure code?


steveklabnik 50 days ago

I don’t think that’s relevant? The change is that it will no longer silently downgrade, and will instead be honest that it’s doing it in all cases.


rattray 49 days ago

I think that gets you access to Mythos, which doesn't have the safeguards. It's configured as a separate model.


tekacs 50 days ago

I was just coming here to post this reply to myself! You're absolutely right! :)

Honestly so glad to see the reversal.


matheusmoreira 50 days ago

Not sure if it's wise to trust them again even if they say they reversed it.


wren6991 49 days ago

They've publicly apologised for the invisible PEFT that deliberately makes the model dumb on some tasks. Whether they still do it, or will once again do it in future in more subtle ways, is something we can't verify.

Personally I think they have proven themselves to be the stewards of AI in the same way Exxon Mobil are the stewards of petroleum.


comboy 50 days ago

There's now a "Switch models when a message is flagged" option in /config that can be set to false, but I haven't had a chance to see what happens then. Does it just stop, or what?


espeed 50 days ago

Session paused

Fable 5 has safety measures that flag messages on most cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Send feedback with /feedback or learn more

   1. Switch to Opus 4.8
   2. Edit prompt and retry with Fable 5


staticautomatic 49 days ago

Biology? Why?


adgjlsfhk1 49 days ago

They're worried about people creating bioweapons.
