| HN Mirror

by shaism 813 days ago

Fundamentally, the pre-trained model would need to learn a "world model" to predict well in distinct domains. This should be possible, setting aside compute requirements and the exact architecture.

After all, the physical world (down to the subatomic level) is governed by physical laws. Ilya Sutskever from OpenAI stated that next-token prediction might be enough to learn a world model (see [1]). That would imply that a model learns a "world model" indirectly, which is even more unrealistic than learning the world model directly through pre-training on time-series data.

[1] https://www.youtube.com/watch?v=YEUclZdj_Sc

2 comments

whimsicalism 813 days ago

But the data-generating process could be literally anything. We are not constrained by physics in any real sense if we are predicting financial markets, occurrences of a certain build error, or termite behavior.

shaism 813 days ago

Sure, there are limits. Not everything is predictable, not even physics. But that is also not the point of such a model. The goal is to forecast across a broad range of use cases that do have underlying laws. Similar to LLMs, such models could also be fine-tuned.

wavemode 812 days ago

"predicting the next token well means that you understand the underlying reality that led to the creation of that token"

People on the AI-hype side of things tend to believe this, but I fundamentally don't.

It's become a philosophical debate at this point (what does it mean to "understand" something, etc.)
