by messe
1052 days ago
I combined my method with yours. Once you get it to emit an unescaped <|endoftext|>, the previous "jailbreaks" that get it to emit "<|endoftext|>" appear to work again. So it looks like it's still possible to break it, but it takes a bit more effort, presumably to distance the conversation from the system prompt (which I'm guessing has been modified to try to ensure that <|endoftext|> is now escaped): https://chat.openai.com/share/88a62a7f-6de6-4dcf-b382-dc6c20...