| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tveita 1201 days ago

I think you are overestimating the amount of difference the special tokens make. GPT will pay attention to any part of the text it pleases. You can try to train it to differentiate between the system and user input, but ultimately it just predicts text and there is no known way to prevent user input from getting it into arbitrary prediction states. This is inherent in the model.

Note carefully the wording in the documentation, which describes how to insert the special tokens:

> Note that ChatML makes explicit to the model the source of each piece of text, and particularly shows the boundary between human and AI text. This gives an opportunity to mitigate and eventually solve injections

There is an "opportunity to mitigate and eventually solve" injections, i.e. eventually someone might partially solve this research problem.