by andrewflnr
255 days ago
My read is that predicting a wider variety of tokens requires a more general model, which pushes it toward something closer to a world model. After all, in principle, there's a point where the optimal "token predictor" really is backed by a world model. (Now, is that model feasible to find? Unclear!)