by RandomBookmarks 3266 days ago

Also in the "better than Tesseract" category are Microsoft Azure OCR (not as good as Google) and the OCR.space OCR API (also not as good as Google, but 100 times cheaper, or free, and it supports PDF).

The best, and most expensive, solution is still Abbyy OCR. They provide an SDK that can be used locally.

A new local OCR solution is Anyline.io, but I have not used them yet.

2 comments

anonynamja 3265 days ago

Sorry to hijack this but I have a question about your comment here: https://news.ycombinator.com/item?id=14441748

How did you get Copyfish to play nice with Zhongwen/Perapera? I've tried it with Chrome and Firefox and nothing seems to get them to pick up on the OCR text.

beagle3 3265 days ago

I'm trying to read things like street signs, speed limits, and store names from pictures that are not necessarily axis-aligned. So far it seems only Google OCR can do those (and does them quite well). Is Abbyy worth trying for that use?

maxerickson 3265 days ago

No API, but Mapillary is doing that with machine learning:

http://blog.mapillary.com/product/2017/02/06/towards-global-...

It seems likely that Google is doing something similar.

ocrcustomserver 3265 days ago

I can probably help you with that. Send me an email.
