| HN Mirror

by ocrcustomserver 3147 days ago

If you're looking for full page OCR, check out ocrmypdf (uses Tesseract).
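For anyone who wants to script this, here is a minimal sketch of calling the ocrmypdf command line from Python. The helper names (`build_ocrmypdf_command`, `run_ocr`) and the file names are made up for illustration. The `-l` and `--deskew` options are ocrmypdf's language and deskew flags, and the defaults chosen here are only an assumption about what a typical scan needs.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, language="eng", deskew=True):
    """Build the argument list for an ocrmypdf run (hypothetical helper)."""
    cmd = ["ocrmypdf", "-l", language]
    if deskew:
        # Straighten slightly rotated pages before OCR.
        cmd.append("--deskew")
    cmd += [str(src), str(dst)]
    return cmd


def run_ocr(src, dst, **options):
    """Run ocrmypdf if it is installed; raise otherwise (hypothetical helper)."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    # check=True raises CalledProcessError if ocrmypdf exits with an error.
    subprocess.run(build_ocrmypdf_command(src, dst, **options), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_ocr(...) to actually process a file.
    print(" ".join(build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf")))
```

The output is a PDF with a searchable text layer over the original page images, which is usually what "full page OCR" means in practice.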

If you want to extract data from documents or forms, you need to either develop your own solution (I'm doing work in this area) or use an expensive package like ABBYY FlexiCapture.

Images taken with a smartphone (as opposed to scanned documents) make the problem harder, since you also have to deal with things like perspective distortion, uneven lighting, and blur.