HN Mirror


by Scene_Cast2 4 hours ago

Really neat findings.

I've personally had a line of thought where you bake the role into the token. Basically, have an embedding for each role (same dim as the token embedding) and add it to each token's embedding. This adds an unambiguous, unspoofable tag.

I ran this with a tiny Shakespeare model (not representative) and had a freeform embedding for each speaker. I ended up with a neat similarity map between every pair of characters. (I don't think the map was very informative for several reasons, but that's outside the scope of a small HN comment.)
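
A minimal sketch of the idea described above, not the commenter's actual code. The sizes, role names, and helper functions (`embed`, `cosine_similarity`) are hypothetical, and the vectors are random stand-ins for what would be learned parameters in a real model. It shows a per-role embedding being added to each token embedding, plus a pairwise similarity map between role embeddings:

```python
import math
import random

# Hypothetical sizes and role names, chosen only for illustration.
VOCAB_SIZE = 16
DIM = 8
ROLES = ["system", "user", "assistant", "tool"]

rng = random.Random(0)


def random_vector(dim):
    """Random stand-in for a learned embedding vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


# One table for tokens and one for roles; both use the same dimension
# so the two vectors can be summed element-wise.
token_embedding = [random_vector(DIM) for _ in range(VOCAB_SIZE)]
role_embedding = {role: random_vector(DIM) for role in ROLES}


def embed(token_ids, roles):
    """Return one vector per position: token embedding plus role embedding."""
    if len(token_ids) != len(roles):
        raise ValueError("need exactly one role per token")
    return [
        [t + r for t, r in zip(token_embedding[tok], role_embedding[role])]
        for tok, role in zip(token_ids, roles)
    ]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# The serving code, not the text itself, assigns the role of each position.
token_ids = [3, 7, 7]
roles = ["user", "user", "assistant"]
vectors = embed(token_ids, roles)

# The same token id under different roles yields different input vectors.
same_token_differs = vectors[1] != vectors[2]

# Pairwise similarity between role embeddings (the "similarity map").
similarity_map = {
    (a, b): cosine_similarity(role_embedding[a], role_embedding[b])
    for a in ROLES
    for b in ROLES
}
```

Because the role is supplied alongside the token ids rather than inside the text, nothing a user types can change which role vector gets added.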


dmazzoni 3 hours ago

My initial thought there is that you'd have an imbalance. Many token patterns would almost never come up with the assistant tag on them, for example words with typos in them.


ryukafalz 3 hours ago

I don't know a ton about how LLMs work (I really should learn), but something like this feels like it might be the way forward to me.

The software running the model knows unambiguously what came from a user and what did not, what came from a tool call and what did not, etc. Having some way of exposing that to the LLM as part of the text itself feels like it fits better with how a neural net works than a set of surrounding tags does.


lelanthran 3 hours ago

> I've personally had a line of thought where you bake the role into the token. Basically, have an embedding for each role (same dim as the token embedding) and add it to each token's embedding. This adds an unambiguous, unspoofable tag.

Wouldn't this require the training data to also be prepped with the control tokens?


Scene_Cast2 3 hours ago

Yes, it would. Or rather, it would need labeling (a role label per token), not extra tokens.


zahlman 3 hours ago

Of course it would, at least at some point; the model has to… model what it means for a token to be a control token. (And the eventual interface of course has to be secure against end users generating such tokens, but that should be easy enough.)

…This somehow feels like AI scientists rediscovering the concept of parenting.


mrob 3 hours ago

You could duplicate every token and reserve the duplicates exclusively for the chain-of-thought, which could be robustly filtered from user input. Basically adding a "thought" bit to each token.
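
One way to picture the "thought bit" is as an offset into a doubled vocabulary. The sketch below is an illustration under assumptions, not an existing tokenizer API: `BASE_VOCAB_SIZE` and all function names are hypothetical, and filtering by dropping reserved ids is just one possible policy.

```python
# Hypothetical base vocabulary size; a real tokenizer would supply this.
BASE_VOCAB_SIZE = 50_000


def to_thought(token_id):
    """Map a base token id to its reserved chain-of-thought duplicate."""
    if not 0 <= token_id < BASE_VOCAB_SIZE:
        raise ValueError("not a base token id")
    return token_id + BASE_VOCAB_SIZE


def is_thought(token_id):
    """True if the id lies in the reserved (duplicated) half of the vocabulary."""
    return BASE_VOCAB_SIZE <= token_id < 2 * BASE_VOCAB_SIZE


def to_base(token_id):
    """Strip the thought bit, returning the underlying base token id."""
    return token_id - BASE_VOCAB_SIZE if is_thought(token_id) else token_id


def sanitize_user_input(token_ids):
    """Keep only base-vocabulary ids (assumed policy: drop everything else)."""
    return [t for t in token_ids if 0 <= t < BASE_VOCAB_SIZE]


# A chain-of-thought sequence uses only the reserved duplicates.
thought_ids = [to_thought(t) for t in [5, 42, 7]]

# User input that somehow contains a reserved id has it filtered out.
clean_ids = sanitize_user_input([1, 50_001, 2])
```

The filter is a simple range check, which is what makes it robust: it does not depend on parsing any text.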
