by AnotherGoodName
382 days ago
OK, playing with this more, there are very subtle differences between sessions, so there is some hallucination going on. I think what's happening is that this is AI-generated but very heavily overfitted to real-world 3D scenes. The AI is rendering an almost exact copy of a real-world scene and not much more. Users can't travel out of bounds, or the model stops working, since it's so overfitted to these scenes. The overfitting solves hallucinations, but it also makes the output almost indistinguishable from pre-modelled 3D scenes.

This would explain:
1. How collisions / teleportation work and why they're so rigid (the world model, or WM, is mimicking hand-implemented scene-bounds logic)
2. Why the scenes are static and, in the case of should-be-dynamic elements like water/people/candles, blurred (the WM is mimicking artifacts from the 3D representation)
3. Why they are confident that "There's no map or explicit 3D representation in the outputs. This is a diffusion model, and video in/out" https://x.com/olivercameron/status/1927852361579647398 (the final product is indeed a diffusion WM trained on videos; they just have a complicated pipeline for getting those training videos)