by jyunderwood 3743 days ago
At work I replaced a [Tesseract](https://github.com/tesseract-ocr) pipeline with some scripts around the Cloud Vision API. I've been pleased with the speed and accuracy so far considering the low cost and light setup.

Btw, here is a Ruby script that will take an API key and image URL and return the text:

https://gist.github.com/jyunderwood/46b601578d9522c0e9ab
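
For readers who do not open the link, here is a rough sketch of what such a script might look like. It is not the contents of the gist. It assumes the Cloud Vision REST endpoint `images:annotate`, the `TEXT_DETECTION` feature, and that the image URL is publicly reachable so it can be passed as `imageUri`. The helper names (`build_request_body`, `extract_text`, `ocr_image_url`) are made up for illustration.

```ruby
require "json"
require "net/http"
require "uri"

# Endpoint of the Cloud Vision REST API's images:annotate method.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Build the JSON request body asking for text detection on one remote image.
# Assumption: the image is reachable by URL, so it is passed as imageUri
# rather than uploaded as base64 content.
def build_request_body(image_url)
  {
    requests: [
      {
        image: { source: { imageUri: image_url } },
        features: [{ type: "TEXT_DETECTION" }]
      }
    ]
  }.to_json
end

# Pull the detected text out of a parsed response hash.
# The first textAnnotations entry holds the full detected text;
# an empty string is returned when nothing was detected.
def extract_text(response_hash)
  annotations = response_hash.dig("responses", 0, "textAnnotations")
  return "" if annotations.nil? || annotations.empty?
  annotations.first["description"].to_s
end

# POST the request and return the detected text (performs a network call).
def ocr_image_url(api_key, image_url)
  uri = URI("#{VISION_ENDPOINT}?key=#{URI.encode_www_form_component(api_key)}")
  response = Net::HTTP.post(uri, build_request_body(image_url),
                            "Content-Type" => "application/json")
  raise "Vision API request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  extract_text(JSON.parse(response.body))
end
```

Calling `puts ocr_image_url(ENV["VISION_API_KEY"], "https://example.com/circular.png")` would print the detected text. Both the environment variable name and the image URL are placeholders.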

Did you see a significant accuracy increase over using Tesseract?
Personally, I have seen a very significant increase in accuracy. Tesseract has a hard time with "real life" scenes in particular.
The accuracy is about the same. We process store circular images, which are actually pretty easy to OCR. It helps that we have large images to start with, and that they are converted to grayscale and then edge-sharpened in ImageMagick before being sent to the OCR process.
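
A preprocessing step like the one described could be sketched roughly as follows. This is a guess at the shape, not the actual pipeline. It assumes ImageMagick's `convert` binary is on the PATH (newer installs may use `magick` instead), uses `-colorspace Gray` for the grayscale step and `-sharpen` for the sharpening step, and the `0x1` setting is only a placeholder. The helper names are hypothetical.

```ruby
# Build an ImageMagick command that converts an image to grayscale and
# then sharpens it. Assumptions: `convert` is on the PATH, and "0x1" is a
# placeholder sharpen setting, not a value taken from the thread.
def preprocess_command(input_path, output_path, sharpen: "0x1")
  ["convert", input_path, "-colorspace", "Gray", "-sharpen", sharpen, output_path]
end

# Run the command. Passing the arguments as a list avoids shell quoting
# issues. The result is true only when ImageMagick exits successfully.
def preprocess_image(input_path, output_path)
  system(*preprocess_command(input_path, output_path))
end
```

For example, `preprocess_image("circular.png", "circular-ocr.png")` would write the cleaned-up copy that then goes to the OCR step. The file names are placeholders.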