| HN Mirror

	by brookst 652 days ago
	This. Images passed to LLMs are typically downsampled to something like 512x512 because that’s perfectly good for feature extraction. Reading text reliably would need much larger images, so that the text is still legible after resizing.
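	A rough back-of-the-envelope sketch of the arithmetic, not how any particular model preprocesses images. The 1920x1080 screenshot, the 16 px text height, and the 8 px legibility floor are all assumptions picked for illustration, and the function names are hypothetical.

```python
import math


def downsampled_text_height(src_w, src_h, text_px, target=512):
    """Text height in pixels after fitting the image inside a target x target box.

    Assumes an aspect-preserving resize that never upscales.
    """
    scale = min(target / src_w, target / src_h, 1.0)
    return text_px * scale


def min_long_side_for_legible_text(src_w, src_h, text_px, min_legible_px=8):
    """Smallest long-side resolution that keeps text at least min_legible_px tall.

    min_legible_px=8 is an assumed floor, not a measured threshold.
    """
    scale = min(min_legible_px / text_px, 1.0)
    return math.ceil(max(src_w, src_h) * scale)


if __name__ == "__main__":
    # Assumed example: a 1920x1080 screenshot with 16 px tall text.
    print(downsampled_text_height(1920, 1080, 16))
    print(min_long_side_for_legible_text(1920, 1080, 16))
```

	Under those assumptions the text ends up a little over 4 px tall at 512 on the long side, and the image would need to be nearly twice as wide to keep the text at the assumed floor.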