by lsy
377 days ago
I think people hear "next token prediction" and assume the claim is that the prediction is simple or linear. They then argue that "intelligence" is possible because the prediction is complex and has some level of indirection or multiple-token-ahead planning baked into the next token. But the thrust of the critique of next-token prediction, or stochastic output, is that there isn't "intelligence" because the output is based purely on syntactic relations between words, not on conceptualizing via a world model built through experience and then using language as an abstraction to describe that world. To the computer there is nothing outside tokens and their interrelations, but for people language is just a tool for describing the world, and that world is what we expect "intelligences" to cope with. That is what this article is examining.
LLMs model concepts internally, and this has been demonstrated empirically many times over the years, including recently by Anthropic (again). Of course, that won't stop people from repeating the claim ad nauseam.