by MrBuddyCasino 4287 days ago
That's the big question. Tesseract is pretty good, though quite slow, I must say.
1 comment

It depends on what is being scanned. If you have a cleanly formatted image taken directly from a scanner, it's a pretty darn quick process.

But in my experience, what adds to the slowness is pre-processing the image to make it suitable for OCR, especially for Tesseract. I still haven't found the magic combination of filters, because every image is different, especially if you source them from users' camera phones.
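
To make "pre-processing" a bit more concrete, here is a rough sketch of one filter that is commonly tried before OCR: binarizing a grayscale image with Otsu's threshold. This is only an assumption about what such a pipeline might include, not the combination the comment is looking for. The function names `otsu_threshold` and `binarize` are made up for illustration, and the image is a plain list of rows of 0-255 values so the example needs no libraries.

```python
def otsu_threshold(pixels):
    """Pick the threshold that maximizes between-class variance (Otsu's method).

    `pixels` is a flat iterable of integer grayscale values in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Return a black/white copy of `image` (a list of rows of 0..255 values)."""
    t = otsu_threshold(p for row in image for p in row)
    return [[255 if p > t else 0 for p in row] for row in image]


# Hypothetical tiny "scan": dark text pixels on a light background.
image = [
    [200, 220, 200],
    [10, 20, 220],
]
clean = binarize(image)
```

A single global threshold like this tends to work on flat scanner output and to fall apart on unevenly lit phone photos, which is one reason no single filter combination fits every image.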