| HN Mirror



	by beklein 815 days ago
	A closer cousin of LLMs would be Vision-Language-Action (VLA) models like RT-2 [1]. In addition to text and vision data, they are trained on robot action data, treated as "another language": the actions are represented as tokens, so the model can output movement commands for robots. [1]: https://robotics-transformer2.github.io
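
	To make the "actions as tokens" idea concrete, here is a minimal sketch of one way to turn a continuous robot action into integer tokens and back, by uniform binning per dimension. This is an illustration, not RT-2's actual code: the bin count, the normalized [-1, 1] range, and the function names are all assumptions.

```python
from typing import List

NUM_BINS = 256          # assumed number of bins per action dimension
LOW, HIGH = -1.0, 1.0   # assumed normalized range of each action dimension


def action_to_tokens(action: List[float]) -> List[int]:
    """Map each continuous action dimension to an integer bin index."""
    tokens = []
    for value in action:
        clipped = min(max(value, LOW), HIGH)       # keep value inside the range
        frac = (clipped - LOW) / (HIGH - LOW)      # rescale to [0, 1]
        tokens.append(min(int(frac * NUM_BINS), NUM_BINS - 1))
    return tokens


def tokens_to_action(tokens: List[int]) -> List[float]:
    """Map bin indices back to the center of each bin."""
    width = (HIGH - LOW) / NUM_BINS
    return [LOW + (t + 0.5) * width for t in tokens]


if __name__ == "__main__":
    # Hypothetical 3-dimensional action, e.g. a small end-effector displacement.
    tokens = action_to_tokens([0.3, -0.7, 0.0])
    print(tokens, tokens_to_action(tokens))
```

	Once actions are integers like this, they can be appended to the text vocabulary and predicted the same way as any other token; decoding loses at most half a bin width per dimension.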