
  > Doesn't this then support the claim that LLMs aren't building world models

There's no strong evidence that LLMs, or any other AI systems, are building world models.

These systems are said to have "world model" capabilities based on benchmarks, but benchmarks will never be able to tell you whether such a feat is taking place. People claim these systems have world models by testing them for consistency. The thing is that a world model is counterfactual: it has to say what would happen in situations that were never observed, not just reproduce the ones that were. The problem with benchmarks is that they do not distinguish memorization from generalization. To make things worse, the term "Out of Distribution" (OOD) is rather fuzzy and gets abused quite a bit (I can explain more if anyone wants). Basically, you should not trust any claim of "few shot" or "zero shot" on its face; no such claim can be made without deep knowledge of the datasets the models were trained on. It helps to go back to the original zero shot papers.
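To make the memorization point concrete, here's a toy sketch. Everything in it is made up for illustration (the rule y = 2x + 1, the function names, the input ranges); it is not how any real benchmark or model works. It only shows that a lookup table and a model that learned the rule get the same benchmark score when the benchmark overlaps the training data.

```python
# Toy illustration (all names and the rule y = 2x + 1 are hypothetical):
# a benchmark score cannot tell a lookup table apart from a model that
# actually learned the underlying rule.

def make_memorizer(train_pairs):
    """Return a 'model' that only replays answers seen during training."""
    table = dict(train_pairs)
    return lambda x: table.get(x)  # returns None for anything unseen

def rule_model(x):
    """A 'model' that captures the underlying rule."""
    return 2 * x + 1

def accuracy(model, pairs):
    """Fraction of (input, target) pairs the model gets exactly right."""
    return sum(model(x) == y for x, y in pairs) / len(pairs)

train = [(x, 2 * x + 1) for x in range(100)]
# Benchmark drawn from the same range as the training data.
benchmark = [(x, 2 * x + 1) for x in range(0, 100, 5)]
# Inputs that never appeared in training.
held_out = [(x, 2 * x + 1) for x in range(1000, 1020)]

memorizer = make_memorizer(train)

bench_mem = accuracy(memorizer, benchmark)    # 1.0
bench_rule = accuracy(rule_model, benchmark)  # 1.0
held_mem = accuracy(memorizer, held_out)      # 0.0
held_rule = accuracy(rule_model, held_out)    # 1.0
```

The two only come apart on `held_out`, and you can only build a set like that if you know what was in `train`. That's the dataset knowledge you usually don't have for an LLM.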

One bit that might help in understanding this is that a world model does not actually need to make correct predictions, which should expose a critical flaw in benchmarking these capabilities. You can look to the history of physics for many great examples. For example, the geocentric model had predictive power, was counterfactual, and was quite accurate. It was in fact a world model, despite being wrong. There was legitimate pushback against Galileo, specifically over tides[0]. If you like that kind of stuff I highly recommend the podcast "Opinionated History of Mathematics"[1].

There's a lot more complexity and nuance to this, but I'll say that there's a reason we do physics the way we do. Benchmarks and empirical evidence play a critical role in developing physics theories and confirming them, but they are not enough on their own to build our models. (You'll also find that physicists commonly dissent from the claim that LLMs have world models. Sure, you'll also find the Max Tegmark types, but in general the consensus is against them, and for good reason.)

Here's a decent paper showing a model being highly accurate yet failing to construct an accurate representation of the environment[2]. This can happen because succeeding at the task does not require modeling the world. World modeling is a natural thing for humans and animals to do, because it generalizes exceptionally well, but you need to be careful when evaluating things via benchmarks and remember that extraordinary claims require extraordinary evidence. I'd say claims of "thinking" or "world modeling" are quite extraordinary, and we should not be hasty to attribute these characteristics when there are many reasonable and simpler alternative explanations.

[0] https://en.wikipedia.org/wiki/Discourse_on_the_Tides

[1] https://intellectualmathematics.com/opinionated-history-of-m...

[2] https://arxiv.org/abs/2406.03689

[disclosure] I have a PhD in Computer Vision and a BS in physics. I care very much about world modeling as a problem, but the response I get from many of my peers is "we just care if it works." That's a concern I share too; it is the reason I ask these questions. It feels quite odd that the motivation for my questions is also used to dismiss them. (FWIW, no physicist or former physicist has ever responded to me this way.)