by RandomBookmarks 3266 days ago
Also in the "better than Tesseract" category are Microsoft Azure OCR (not as good as Google) and the OCR.space OCR API (also not as good as Google, but 100 times cheaper or free, and it supports PDF).

The best - and most expensive - solution is still Abbyy OCR. They provide an SDK that can be used locally.

A new local OCR solution is Anyline.io, but I have not used them yet.

2 comments

Sorry to hijack this but I have a question about your comment here: https://news.ycombinator.com/item?id=14441748

How did you get Copyfish to play nice with Zhongwen/Perapera? I've tried it with Chrome and Firefox and nothing seems to get them to pick up on the OCR text.

I'm trying to read things like street signs, speed limits, and store names from not-necessarily-axis-aligned pictures - so far it seems only Google OCR can do those (and does them quite well). Is Abbyy worth trying for that use?

No API, but Mapillary is doing that with machine learning:

http://blog.mapillary.com/product/2017/02/06/towards-global-...

It seems likely that Google is doing something similar.

I can probably help you with that. Send me an email.