by jiaweihli 3843 days ago
Have you looked into ocropy[0]?

Here's a nice intro[1], with a follow-up on how ocropy achieves higher accuracy using an LSTM model[2].

[0] https://github.com/tmbdev/ocropy

[1] http://www.danvk.org/2015/01/09/extracting-text-from-an-imag...

[2] http://www.danvk.org/2015/01/11/training-an-ocropus-ocr-mode...

2 comments

I have not. It sounds interesting but raw and unsuitable for end-users. I hope the quality improves and they can get it packaged up in a way that existing document scanners can plug into easily.
Note that the primary author of ocropy (formerly ocropus) works at Google.