| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Geee 360 days ago
	Vision-language-action models seem to be the broad category for the best approach, which basically combines a large vision-language model with robotic actions. For example, see https://www.physicalintelligence.company/blog/pi0