by totalcookie 4007 days ago
If there is no open-source or free software of the same quality, what then? What are you using as a server-side OCR system on Linux? I'm certainly not good enough to write my own OCR that beats Abbyy.

As others in the thread have mentioned, treat this as a computer-vision problem: segment the page into clean regions for Tesseract to work on. Add some good training data, and possibly human validation if that's feasible.

All doable on Linux.
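
To make the segmentation idea concrete, here is a minimal sketch of one common approach, a horizontal projection profile that splits a binarized page into text-line bands. It is only an illustration under assumptions: the page is assumed to be already binarized and deskewed (1 = ink, 0 = background), and `split_text_lines` and `crop_rows` are hypothetical helper names, not part of Tesseract. Each cropped band would then be handed to Tesseract, for example with a single-line page segmentation mode such as `--psm 7`; that call is not shown here.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: rows of 0/1 pixels, 1 = ink


def split_text_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges that contain ink; bottom is exclusive."""
    bands: List[Tuple[int, int]] = []
    start = None  # first row of the band currently being scanned, if any
    for y, row in enumerate(image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                  # a new band begins here
        elif not has_ink and start is not None:
            bands.append((start, y))   # a blank row closes the band
            start = None
    if start is not None:              # band runs to the bottom edge
        bands.append((start, len(image)))
    return bands


def crop_rows(image: Image, band: Tuple[int, int]) -> Image:
    """Cut one band out of the page so it can be sent to the OCR engine."""
    top, bottom = band
    return image[top:bottom]


# Toy 5x4 page with two "lines" of ink separated by a blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
for band in split_text_lines(page):
    print(band, crop_rows(page, band))
```

On real scans you would likely raise `min_ink` to ignore specks of noise, and repeat the same trick on columns to split words or fields within each band.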

As the parent of this comment thread mentions, Tesseract is not great for high-volume use because of its error rate; Abbyy is much better. I would be interested in experience, not opinion.