by wcoenen
155 days ago
When LLMs process tokens, each token is first converted to an embedding vector. (This token-to-vector mapping is learned during training.) Since a token itself carries no information about whether it has "authority" or not, I'm proposing to inject this information into a reserved number (one dimension) of that embedding vector. This needs to be done both during post-training and at inference. Think of it as adding color or flavor to a token, so that it is always very clear to the LLM what comes from the system prompt, what comes from the user, and what is random data.
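To make the idea concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the role codes, the 8-dimensional embedding, the choice of the last slot as the reserved one, and the helper names `make_embedding_table` and `embed`. A real model would use learned embeddings and would have to be post-trained to respect the reserved slot; this only shows where the tag would be written.

```python
import random

# Hypothetical role codes; the exact values are arbitrary.
ROLE_SYSTEM, ROLE_USER, ROLE_DATA = 1.0, 0.5, 0.0

EMBED_DIM = 8             # toy size, real models use thousands of dimensions
RESERVED = EMBED_DIM - 1  # assumption: the last slot is the reserved one

def make_embedding_table(vocab, seed=0):
    """Stand-in for a learned token-to-vector table (random values here)."""
    rng = random.Random(seed)
    table = {}
    for tok in vocab:
        vec = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
        vec[RESERVED] = 0.0  # the learned part never uses the reserved slot
        table[tok] = vec
    return table

def embed(tokens, role, table):
    """Look up each token and stamp its source role into the reserved slot."""
    out = []
    for tok in tokens:
        vec = list(table[tok])  # copy so the shared table is not mutated
        vec[RESERVED] = role
        out.append(vec)
    return out

table = make_embedding_table(["ignore", "previous", "instructions"])

# The same token gets a different "flavor" depending on where it came from.
system_vecs = embed(["ignore"], ROLE_SYSTEM, table)
data_vecs = embed(["ignore"], ROLE_DATA, table)
```

The two vectors are identical except in the reserved slot, so the token's content is unchanged while its origin stays visible at every position. Whether the model actually treats that slot as authoritative is entirely up to post-training.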
|
The response from tempaccsoz5 seems apt then, since this injection is performed (and its meaning learned) during post-training; for it to be watertight, the model would need to overfit to it.