Ask HN: Is there no good OCR available?

	2 points by leokster 671 days ago
	I'm wondering whether tools like Tesseract are still the open-source (and offline) gold standard. All the large cloud providers now offer document intelligence services, but there still isn't really a usable AI model that does good OCR (images, not necessarily scans, to text). Do you know of any active projects or resources in this field?
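For reference, the offline Tesseract baseline mentioned above can be driven from a script roughly like this. It is a minimal sketch, assuming the `tesseract` binary and the relevant language data are installed locally; the helper names `build_tesseract_cmd` and `ocr_image` are made up for illustration, and the default page segmentation mode may not be the best choice for photos.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, lang="eng", psm=3):
    """Build the argument list for a Tesseract CLI call (hypothetical helper)."""
    # Passing "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_image(image_path, lang="eng", psm=3):
    """Run Tesseract locally and return the recognized text."""
    # Assumption: the tesseract binary is installed and on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Something like `ocr_image("photo.png", psm=11)` (sparse-text mode) may be worth trying on photos rather than scans, though results will vary with the image.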

2 comments

latexr 671 days ago

Apple’s operating systems have been doing stellar OCR since 2019. When the feature was announced I was uninterested, but now I’m surprised how much I use it. It works without any extra work in Preview, Safari, and other apps. You can call it programmatically via Shortcuts or the Vision APIs.

https://developer.apple.com/documentation/vision/recognizing...

solardev 671 days ago

(Edit: Never mind, sorry. I misread your question. I think you're mainly interested in free offline apps.)

Does it have to be an "AI" model in the modern sense of the term (LLMs, etc.)?

In the past, I found Google's Cloud Vision API to be pretty good for this sort of thing (text in images): https://cloud.google.com/vision?hl=en#demo
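As a rough illustration of what calling it looks like, here is a minimal sketch against the Cloud Vision REST endpoint using only the standard library. It assumes you have an API key with the Vision API enabled; the helper names `build_vision_request` and `detect_text` are hypothetical, and the response handling only covers the simple single-image case.

```python
import base64
import json
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes, handwriting=False):
    """Build the JSON body for a single-image text detection request."""
    # DOCUMENT_TEXT_DETECTION is aimed at dense text and handwriting;
    # TEXT_DETECTION is the general-purpose feature for text in photos.
    feature = "DOCUMENT_TEXT_DETECTION" if handwriting else "TEXT_DETECTION"
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature}],
            }
        ]
    }


def detect_text(image_path, api_key, handwriting=False):
    """Send one image to the API and return the detected text, if any."""
    with open(image_path, "rb") as f:
        payload = build_vision_request(f.read(), handwriting)
    req = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",  # assumption: API-key authentication
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # An image with no detectable text yields an empty response object.
    return reply["responses"][0].get("fullTextAnnotation", {}).get("text", "")
```

Note that this sends the image to Google's servers, so it does not meet the offline requirement in the original question.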

AFAIK Tesseract was never state of the art; it was just free. The commercial offerings (in my limited experience) were usually much more accurate.

verdverm 671 days ago

Seconding Google's offering, which can read my chicken scratch reasonably well.
