by zo1 4007 days ago
And they limit you to a set number of scanned pages per year on the corporate/server offerings. Doubly so for all their dev/API license options.

And the kicker: you can't buy those licenses from them directly. They put you in contact with some random local monopoly distributor that usually has mandatory "training" charges.

Messy, and I'm planning on staying away from that with a ten-foot pole.

If there is no open-source/free software of the same quality, what then? What are you using as a server-side OCR system on Linux? I'm certainly not good enough to write my own OCR better than Abbyy.

As others in the thread have mentioned: treat it as a computer-vision problem first, and segment the page into clean pieces for Tesseract to work on. Add some good training data, and possibly human validation if that's feasible.

All doable within Linux.
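
To make the "segment first, then OCR, then validate" idea concrete, here is a rough sketch under my own assumptions, not anyone's production setup. It splits a binarized page into text-line bands with a horizontal projection profile, hands each band to an OCR callable, and flags low-confidence results for a human. The names `find_line_bands`, `ocr_page`, `recognize` and `min_confidence` are all hypothetical; in practice the segmentation would more likely be done with something like OpenCV, and `recognize` would be a wrapper around Tesseract.

```python
from typing import Callable, Dict, List, Tuple

# A binarized page as rows of pixels: 1 = ink, 0 = background (assumed format).
Image = List[List[int]]


def find_line_bands(page: Image, min_ink: int = 1, min_height: int = 2) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges that contain ink, bottom exclusive.

    Uses a horizontal projection profile: a row counts as text when it has
    at least `min_ink` ink pixels. Bands shorter than `min_height` rows are
    dropped as specks.
    """
    bands = []
    start = None
    for y, row in enumerate(page):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a band begins here
        elif not has_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the page.
    if start is not None and len(page) - start >= min_height:
        bands.append((start, len(page)))
    return bands


def ocr_page(
    page: Image,
    recognize: Callable[[Image], Tuple[str, float]],
    min_confidence: float = 0.8,
) -> List[Dict]:
    """OCR each band separately and mark uncertain ones for human review.

    `recognize` is a stand-in for the real engine: it takes a cropped band
    and returns (text, confidence in 0..1).
    """
    results = []
    for top, bottom in find_line_bands(page):
        text, confidence = recognize(page[top:bottom])
        results.append({
            "rows": (top, bottom),
            "text": text,
            "needs_review": confidence < min_confidence,
        })
    return results
```

The point of passing `recognize` in is that the segmentation and the review queue don't care which engine sits behind it, so you can compare engines on the same bands and only send the doubtful ones to a person.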

As the parent of this comment thread mentions, Tesseract is not great for mass usage because of its error rate, and Abbyy is much better. I would be interested in experience, not opinion.