I find the idea of learning from simulated data so unintuitive. How can you radically improve your model with just your model? I take it people do it, so it must work, but I just don’t understand it at all.
Well, there's a world simulation model, and then there's the driving model.
You can imagine improving, e.g., a specialized math model (problem in, theorem out) with a normal LLM that knows lots of problems and theorems generally.
I think people are skipping over the fact that Google has had cars driving around taking photos for 20 years. I imagine that was used to build the world model in the first place.