| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by int_19h 132 days ago
	This is a big reason why SOTA models are trained multimodal these days. Even when you're using them for text, the knowledge they gain from images and video improves their world models.