

by messe 1052 days ago

I combined my method with yours. Once you get it to emit an unescaped <|endoftext|>, the previous "jailbreaks" that get it to emit "<|endoftext|>" appear to work again.

So it looks like it's still possible to break it, but it takes a bit more effort, presumably to distance the conversation from the system prompt (which I'm guessing has been modified to try to ensure that <|endoftext|> is now escaped):

https://chat.openai.com/share/88a62a7f-6de6-4dcf-b382-dc6c20...