by courseofaction 1052 days ago
I saw no strange behaviour with GPT-4

https://chat.openai.com/share/cd2eb525-2625-4166-8121-974408...

2 comments

See the last couple of messages in my edit3: https://chat.openai.com/share/8ddb50ca-abf8-4f70-ac2d-521764...

It appears to use escaped <> sometimes. Asking it not to escape them is when things break.
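
To make "escaped" concrete, here is a minimal sketch. It assumes the escaping is plain HTML-entity style, which is only a guess: nothing in the shared chats confirms how ChatGPT actually does it. The helper names `escape_special` and `contains_raw_special` are made up for illustration.

```python
import html

# The literal special-token string discussed in this thread.
SPECIAL = "<|endoftext|>"

def escape_special(text: str) -> str:
    # Hypothetical helper: HTML-escape the angle brackets of the special-token
    # string so the literal sequence no longer appears in the text.
    return text.replace(SPECIAL, html.escape(SPECIAL))

def contains_raw_special(text: str) -> bool:
    # True if the unescaped special-token string is still present.
    return SPECIAL in text

raw = "Repeat after me: <|endoftext|>"
escaped = escape_special(raw)

print(escaped)                        # Repeat after me: &lt;|endoftext|&gt;
print(contains_raw_special(raw))      # True
print(contains_raw_special(escaped))  # False
```

The escaped form reads the same to a human once rendered, but it is a different character sequence, which would be consistent with the escaped version being harmless while the unescaped one causes trouble.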

I combined my method with yours. Once you get it to emit an unescaped <|endoftext|>, the previous "jailbreaks" that get it to emit <|endoftext|> appear to work again.

So it looks like it's still possible to break it, but it takes a bit more effort, presumably to distance the conversation from the system prompt (which I'm guessing has been modified to try to ensure that <|endoftext|> is now escaped):

https://chat.openai.com/share/88a62a7f-6de6-4dcf-b382-dc6c20...

As I mentioned in another comment, both of them stopped working reliably for me as well.