by hodgehog11 347 days ago

> A next token predictor by definition is recalling.

I think there may be a terminology mismatch here, because under the statistical definitions of these words, which are the ones used in machine learning, this assertion is false. A next-token predictor is a mapping that takes the preceding context and outputs a vector of logits over the vocabulary, which defines a probability distribution over the next token. That definition says nothing about the mechanism by which the logits are computed, so nothing in it rules out intelligent output.
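To make the "mapping, not mechanism" point concrete, here is a rough sketch. Everything in it is made up for illustration (the toy vocabulary and the names `table_predictor`, `computing_predictor`, `greedy_next`); it is not how any real model works. Both functions satisfy the same context-to-logits interface, but one recalls a stored answer and the other computes it.

```python
from typing import Callable, Dict, List

# Hypothetical toy types: a predictor maps a token context to one logit per vocabulary token.
Logits = Dict[str, float]
Predictor = Callable[[List[str]], Logits]

# Toy vocabulary (an assumption for this sketch): the numbers 0..18 plus two symbols.
VOCAB = [str(n) for n in range(19)] + ["+", "="]


def table_predictor(context: List[str]) -> Logits:
    """Recall: only knows contexts it has stored verbatim."""
    memory = {("2", "+", "3", "="): "5"}
    answer = memory.get(tuple(context))
    return {tok: (1.0 if tok == answer else 0.0) for tok in VOCAB}


def computing_predictor(context: List[str]) -> Logits:
    """Computation: adds the operands for any context shaped like `a + b =`."""
    answer = None
    if len(context) == 4 and context[1] == "+" and context[3] == "=":
        answer = str(int(context[0]) + int(context[2]))
    return {tok: (1.0 if tok == answer else 0.0) for tok in VOCAB}


def greedy_next(predictor: Predictor, context: List[str]) -> str:
    """Pick the token with the highest logit."""
    logits = predictor(context)
    return max(logits, key=logits.get)


seen = ["2", "+", "3", "="]
unseen = ["7", "+", "8", "="]
print(greedy_next(table_predictor, seen))        # 5
print(greedy_next(computing_predictor, seen))    # 5
print(greedy_next(computing_predictor, unseen))  # 15
```

On the stored context the two are indistinguishable from the outside. Only the unseen context separates them, which is the sense in which "next-token predictor" names an interface rather than a mechanism.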

A predictor is not necessarily memorizing either, in the same way that a line of best fit is not a hash table.
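A minimal sketch of that comparison, assuming toy data generated from the rule y = 2x + 1 (the data and the helper names `fit_line` and `predict` are hypothetical). The dict can only return points it has stored, while the least-squares line produces an answer for an input it has never seen.

```python
# Toy training data, assumed to come from the rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Memorization: a hash table that can only return what it has stored.
lookup = dict(zip(xs, ys))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """Prediction: evaluates the fitted line, including at unseen inputs."""
    return slope * x + intercept


unseen = 10.0
print(lookup.get(unseen))  # None: the table has no entry for this input
print(predict(unseen))     # 21.0: the fitted line extrapolates
```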

> Why exactly? You're stating a priori that the argument is wrong without saying why.

Because you can prove that for any human, there exists a next-token predictor that matches, word for word, that person's most likely response to any given query. Judged by its outputs, such a predictor is indistinguishable from intelligence. That's a theoretical counterexample to the claim that next-token prediction alone is incapable of intelligence.