| HN Mirror



	by htrp 480 days ago
	VLMs can't replace OCR one-to-one. Most hosted multimodal models seem to have a classical OCR (Tesseract-based) step in their inference loop.