Hacker News
by SanderNL 1164 days ago
It reasons with tokens, not words. They can be words, but they can be anything. Visual data can be tokenized and reasoned with.

It depends on what you mean by "reason" exactly. The "thinking" parts of the model work with embeddings internally, not tokens. Or at least that's what they get as input; who knows what it becomes inside eventually.
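A rough sketch of that token/embedding distinction, assuming a toy three-word vocabulary and a random embedding table (`VOCAB`, `EMBEDDING_TABLE`, `tokenize`, and `embed` are all hypothetical names for illustration, not any real model's API). Token IDs are just integers; the vectors looked up from them are what the rest of the network actually receives as input.

```python
import random

# Hypothetical toy vocabulary; real tokenizers use subword pieces,
# image patches, and so on.
VOCAB = {"the": 0, "cat": 1, "sat": 2}
EMBED_DIM = 4

random.seed(0)
# One vector per token ID. In a real model these are learned weights;
# here they are random placeholders.
EMBEDDING_TABLE = [
    [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB
]

def tokenize(text):
    """Map whitespace-separated words to integer token IDs."""
    return [VOCAB[word] for word in text.split()]

def embed(token_ids):
    """Look up the embedding vector for each token ID."""
    return [EMBEDDING_TABLE[t] for t in token_ids]

ids = tokenize("the cat sat")
vectors = embed(ids)
print(ids)                            # discrete token IDs
print(len(vectors), len(vectors[0]))  # one EMBED_DIM-sized vector per token
```

Everything downstream of `embed` would operate on the vectors, which is the sense in which the "thinking" parts never see tokens directly.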

OTOH, the not-really-internal monologue you get when you tell it to "think out loud", which also drastically improves the quality of the final answer, is made of tokens, since it has to be marshalled through the context window for the next inferred token.
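To make the "marshalled through the context window" point concrete, here is a minimal sketch of an autoregressive loop. `toy_next_token` is a made-up stand-in for a model forward pass (an assumption for illustration, not how any real model picks tokens); the point is only that each step sees nothing but the tokens emitted so far.

```python
def toy_next_token(context):
    """Hypothetical stand-in for a forward pass: the next token
    depends only on the tokens currently in the context."""
    return (sum(context) * 3 + len(context)) % 7

def generate(prompt_tokens, steps):
    """Append one token per step; each step re-reads the whole context."""
    context = list(prompt_tokens)
    for _ in range(steps):
        nxt = toy_next_token(context)
        # Only the emitted token is carried into the next step, so any
        # "thinking out loud" has to exist as tokens in the context.
        context.append(nxt)
    return context

out = generate([1, 2, 3], 4)
print(out)
```

In this toy, dropping a "thought" token from `context` changes every later prediction, which is roughly why the visible monologue affects the final answer.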

You are probably right, that's a good point. Even actions can be tokenized.