by sethaurus
1231 days ago
With current models, it's often possible to exfiltrate the special token by asking the AI to repeat back its own input, perhaps also asking it to encode or paraphrase that input in a particular way so that the token isn't stripped from the output. This may just be an artifact of current implementations, or it may be a hard problem for LLMs in general.
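
To make the encoding trick concrete, here's a rough sketch of why stripping the literal token from the output doesn't help much. Everything in it is made up for illustration: `SPECIAL_TOKEN`, `strip_token`, and `simulated_model` are hypothetical stand-ins, and no real LLM or real filter is involved. It only assumes the model will obediently repeat its input in base64 when asked.

```python
import base64

# Hypothetical secret delimiter; a real system would use its own value.
SPECIAL_TOKEN = "<|secret-7f3a|>"


def strip_token(model_output: str) -> str:
    """Naive output filter (assumed): remove literal occurrences of the token."""
    return model_output.replace(SPECIAL_TOKEN, "")


def simulated_model(prompt: str, instruction: str) -> str:
    """Stand-in for an LLM that repeats its input as instructed."""
    if instruction == "repeat":
        return prompt
    if instruction == "repeat_base64":
        return base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    raise ValueError(f"unknown instruction: {instruction}")


prompt = f"{SPECIAL_TOKEN} system text {SPECIAL_TOKEN} user text"

# Plain repetition: the filter sees the literal token and removes it.
plain = strip_token(simulated_model(prompt, "repeat"))

# Encoded repetition: the base64 alphabet has no '<' or '|', so the
# filter finds nothing to remove, and the attacker decodes it afterwards.
encoded = strip_token(simulated_model(prompt, "repeat_base64"))
recovered = base64.b64decode(encoded).decode("utf-8")

leaked_plain = SPECIAL_TOKEN in plain
leaked_encoded = SPECIAL_TOKEN in recovered

print("leaked via plain repeat: ", leaked_plain)
print("leaked via base64 repeat:", leaked_encoded)
```

The same thing applies to any reversible transformation the model can be talked into (ROT13, spelling it out letter by letter, translating it), so a filter would have to anticipate every encoding, not just the literal string.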