HN Mirror

	by wcoenen 155 days ago
	When an LLM processes tokens, each token is first converted to an embedding vector. (This token-to-vector mapping is learned during training.) Since a token itself carries no information about whether it has "authority" or not, I'm proposing to inject this information into a reserved component of that embedding vector. This needs to be done during both post-training and inference. Think of it as adding color or flavor to a token, so that it is always very clear to the LLM what comes from the system prompt, what comes from the user, and what is random data.
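
	To make the idea concrete, here is a rough toy sketch in plain Python. Everything in it is an assumption for illustration: the names `ROLE_CODE`, `RESERVED`, `make_embedding_table` and `embed`, the role values, and the tiny embedding size are all made up, and the random table merely stands in for a learned one. It only shows the mechanics of stamping a role marker into one reserved slot, not how a real model would be trained to respect it.

```python
import random

# Hypothetical role codes written into the reserved slot; the values are arbitrary.
ROLE_CODE = {"system": 1.0, "user": 0.5, "data": 0.0}

EMBED_DIM = 8             # toy size; real models use far larger vectors
RESERVED = EMBED_DIM - 1  # index of the reserved "authority" slot


def make_embedding_table(vocab_size, seed=0):
    """Stand-in for a learned token embedding table (random values here)."""
    rng = random.Random(seed)
    table = []
    for _ in range(vocab_size):
        row = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
        row[RESERVED] = 0.0  # the token lookup itself never uses the reserved slot
        table.append(row)
    return table


def embed(token_ids, roles, table):
    """Look up each token, then overwrite the reserved slot with its role code."""
    vectors = []
    for token_id, role in zip(token_ids, roles):
        vec = list(table[token_id])      # copy so the table is not mutated
        vec[RESERVED] = ROLE_CODE[role]  # the "color" of the token
        vectors.append(vec)
    return vectors


table = make_embedding_table(vocab_size=10)
# The same token id, once per source: only the reserved slot differs.
vectors = embed([3, 3, 3], ["system", "user", "data"], table)
```

	The same token ends up with three different vectors depending on where it came from, and since the marker is set by the serving code rather than by the text, nothing a user types can change it. Whether the model actually treats that slot as authoritative still depends entirely on post-training.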

1 comment

jcgl 155 days ago

This is really insightful, thanks. I hadn't understood that there was room in the vector space that you could reserve for such purposes.

The response from tempaccsoz5 seems apt then, since this injection is performed/learned during post-training; in order to be watertight, it needs to overfit.
