by nhirschfeld 495 days ago

Thanks, I'll check these links.

In my tests I found Tesseract quite good for regular text documents. For other kinds of text it's not great.

As for using models - there are some good small language models as well, and of course LLMs.

I sorta feel though that if one needs complex OCR, or a vision model for layout, one should either opt for a commercial solution that abstracts away the deployment and GPU management, or build one's own system.

For most use cases involving text documents though, my subjective opinion is that Tesseract is sufficient.