by UltraSane
359 days ago
This paper argues the opposite: https://arxiv.org/abs/2506.01622

"Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agents performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences: from developing safe and general agents, to bounding agent capabilities in complex environments, and providing new algorithms for eliciting world models from agents."
|
On my reading, the philosophical claim is that these models do not develop an actual logical, internal representation of their domains.
The functional question is whether they can realize specific behaviors within a domain. The paper argues that a Markov process can achieve functional equivalence to the original goal-oriented picture of its domain (that is, it can solve goals within an error bound), but not that it develops an actual representation of the domain.
Lacking an actual representation prevents such a machine from doing other things. For example, IIUC, it would be unable to solve problems in domains that are homomorphic to the original, whereas an explicit representation would enable this.