by hahaxdxd123
362 days ago
Extremely oversold article.

> the core insight: predict in representation space, not pixels

We've been doing this since 2014? Not only that, others have been doing it at a similar scale, e.g. Nvidia's world foundation models (although those are generative).

> zero-shot generalization (aka the money shot)

This is easily beaten by flow-matching imitation learning models like what Pi has.

> accidentally solved robotics

They're doing 65% success on very simple tasks.

The research is good. This article, however, misses a lot of other work in the literature. I would recommend you don't read it as an authoritative source.