| > And in order to predict the next token well they have to build world models
This is not true. Look at GPT-2 or BERT. A world model is not a requirement for next token prediction in general.
> This has been proven
One white paper with data that _suggests_ the author’s hypothesis is far from proof. That paper doesn’t show creation of a “world model”, just parts of the model that seem correlated to higher level ideas not specifically trained on. There’s also no evidence that the LLM makes heavy use of those sections during inference, as pointed out at the start of section 5 of that same paper. Let me see how reproducible this is across many different LLMs as well…
> In other words, you don't know that you are not just a fancy next token predictor.
“You can’t prove that you’re NOT just a guessing machine”
This is a tired stochastic parrot argument that I don’t feel like engaging again, sorry. Talking about unfalsifiable traits of human existence is not productive.
But the stochastic parrot argument doesn’t hold up to scrutiny. |
Conjecture. Maybe they all have world models; they're just worse world models. There is no threshold beyond which something is or is not a world model; there is a continuum of models of varying degrees of accuracy. No human has ever had a perfectly accurate world model either.
> One white paper with data that _suggests_ the author’s hypothesis is far from proof.
This is far from the only paper.
> This is a tired stochastic parrot argument that I don’t feel like engaging again, sorry.
Much like your tired stochastic parrot argument about LLMs.