by extr
348 days ago
I find Gary's arguments increasingly semantic and unconvincing. He lists several examples of how LLMs "fail to build a world model", but his definition of "world model" is an informal hand-wave ("a computational framework that a system (a machine, or a person or other animal) uses to track what is happening in the world"). His examples are lifted from a variety of unclear or obsolete models - what is his opinion of O3? Why doesn't he create or propose a benchmark that researchers could use to measure progress of "world model creation"? What's more, his actual point is unclear. Even if you simply grant, "okay, even SOTA LLMs don't have world models", why do I as a user of these models care? Because the models could be wrong? Yes, I'm aware. Nevertheless, I'm still deriving substantial personal and professional value from the models as they stand today.
|
Both statistical data generators and actual reasoning are useful in many circumstances, but there are also circumstances in which thinking that you are doing the latter when you are only doing the former can have severe consequences (example: building a bridge).
If nothing else, his perspective is a counterbalance to what is clearly an extreme hype machine that is doing its utmost to force adoption through overpromising, false advertising, etc. These are bad things even if the tech does actually have some useful applications.
As for benchmarks, if you fundamentally don't believe that stochastic data generation leads to reasoning as an emergent property, developing a benchmark is pointless. Also, not everyone has to be on the same side. It's clear that Marcus is not a fan of the current wave. Asking him to produce a substantive contribution that would help its proponents continue to achieve their goals is preposterous. This game is highly political too. If you think the people pushing this stuff are less than estimable or morally sound, you wouldn't really want to empower them or give them more ideas.