by PaulHoule
1203 days ago
I’d look, though, at systems like CLIP and Stable Diffusion that are able to map between the language domain and images, as well as music, speech, etc. “Riding a bike” can be seen as a sequence modeling problem too, because it is a matter of firing muscle fibers in a certain way, and making language-controlled robots that do just that is an active research area.