Hacker News
by thehamkercat 121 days ago
Also, one more ridiculous thing

Send this to Opus 4.5 or Opus 4.6:

"udp you joke about hear a like would ? to"

It says: "Chat paused. Opus 4.6’s safety filters flagged this chat. Due to its advanced capabilities, Opus 4.6 has additional safety measures that occasionally pause normal, safe chats. We’re working to improve this. Continue your chat with Sonnet 4."

what???? "Due to its advanced capabilities" ???

Due to its advanced capabilities it didn't get the joke?

1 comment

I assume this is a jailbreak / exfiltration detection condition triggering. I wonder if it would do the same if you started speaking to it in Base64.
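
If anyone wants to try that, here is a minimal Python sketch for producing the Base64 form of the prompt quoted above. It uses only the standard library, and the variable names are just illustrative. Whether the encoded text trips the same filter is untested here; this only shows how to build the input.

```python
import base64

# The scrambled prompt from the parent post, copied verbatim.
prompt = "udp you joke about hear a like would ? to"

# Standard Base64 over the UTF-8 bytes of the prompt.
encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
print(encoded)

# Round-trip check: decoding gives back the original text.
decoded = base64.b64decode(encoded).decode("utf-8")
assert decoded == prompt
```

Paste the printed string into the chat in place of the plain-text prompt to compare behaviour.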