| HN Mirror

	by ilaksh 512 days ago
	I think that this is the obvious path to more robust models -- grounding language in video.