Hacker News
by PaulHoule 1203 days ago
I’d look, though, at systems like CLIP and Stable Diffusion, which map between language and images, and at similar systems for music, speech, etc. “Riding a bike” can be seen as a sequence modeling problem too, because it is a matter of firing muscle fibers in a certain way, and making language-controlled robots that do just that is an active research area.