| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by stealthcat 700 days ago
	Marketed as “multimodal” but actually texts and images. Multimodal dataset should be multimedia: text, audio, images, video, and optionally more like sensor readings and robot actions.