|
|
|
|
|
by shaism
766 days ago
|
|
Fundamentally, the pre-trained model would need to learn a "world model" to predict well in distinct domains. This should be possible not regarding compute requirements and the exact architecture. After all, the physical world (down to the subatomic level) is governed by physical laws. Ilya Sutskever from OpenAI stated that next-token prediction might be enough to learn a world model (see [1]). That would imply that a model learns a "world model" indirectly, which is even more unrealistic than learning the world model directly through pre-training on time-series data. [1] https://www.youtube.com/watch?v=YEUclZdj_Sc |
|