Hacker News
by andrewflnr 255 days ago
My read is that predicting more varied tokens requires a more general model, which pushes it toward something closer to a world model. After all, in principle, there's a point where the optimal "token predictor" really is backed by a world model. (Now, is that model feasible to find? Unclear!)