


	by cma 263 days ago
	Some multimodal models may have a hidden captioning step that consumes completion tokens; others work on a fully native representation, and I think some do both.