by ACCount37
41 days ago
They can be base models for a bunch of things. Turning text-conditioned video generation models into robotics VLAs is a fun exercise. This one is probably too small to be useful for that, and not diverse enough? But I could be wrong.