by hbornfree 4007 days ago
As others pointed out, Tesseract with OpenCV (for identifying and cropping the text region) is quite effective. On top of that, Tesseract can be trained on custom fonts.
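
To make the "identify and crop the text region" step concrete, here is a minimal, dependency-free sketch. It is not OpenCV code: it assumes dark text on a light background, a grayscale image stored as a list of rows, and a hypothetical helper name `crop_text_region`. In OpenCV the analogous steps would typically be a threshold followed by a bounding rectangle, and the cropped region would then be handed to Tesseract.

```python
def crop_text_region(gray, threshold=128, margin=1):
    """Return the sub-image that tightly bounds all dark ("ink") pixels.

    gray: list of rows, each a list of 0-255 intensities.
    threshold and margin are illustrative defaults, not tuned values.
    """
    width = len(gray[0])
    # Rows and columns that contain at least one pixel darker than the threshold.
    rows = [y for y, row in enumerate(gray) if any(p < threshold for p in row)]
    cols = [x for x in range(width) if any(row[x] < threshold for row in gray)]
    if not rows or not cols:
        return []  # no ink found, nothing to crop

    # Expand the bounding box by a small margin, clamped to the image borders.
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, len(gray) - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return [row[left:right + 1] for row in gray[top:bottom + 1]]


# A blank 8x6 "page" with a few ink pixels in the middle.
page = [[255] * 8 for _ in range(6)]
page[2][3] = page[2][4] = page[3][4] = 0
crop = crop_text_region(page)
```

A single bounding box only works when the page holds one block of text; multi-column or noisy scans would need per-region detection instead.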

In our use case, we've mostly had to deal with handwritten text, and that's where none of these tools really did well. Your next best bet would be to use HOG (Histogram of Oriented Gradients) features along with SVMs. OpenCV has really good implementations of both.
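
For anyone unfamiliar with HOG, here is a rough sketch of the core idea in plain Python: compute the gradient at each pixel, then let each pixel vote for its edge orientation, weighted by gradient magnitude. This is a simplification, not OpenCV's implementation. It treats the whole image as a single cell, skips block normalization, and leaves out the SVM stage entirely. The function name `hog_cell_histogram` and the 9-bin default are assumptions for illustration.

```python
import math


def hog_cell_histogram(gray, bins=9):
    """Unsigned-orientation (0-180 degrees) gradient histogram for one cell.

    gray: list of rows, each a list of pixel intensities.
    Returns an L2-normalized list of `bins` floats.
    """
    height, width = len(gray), len(gray[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # centered horizontal difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # centered vertical difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no vote
            # Fold the gradient direction into [0, 180): sign of the edge is ignored.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


# A vertical edge has purely horizontal gradients, so all votes land in bin 0.
vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
vertical_hist = hog_cell_histogram(vertical_edge)

# A horizontal edge has purely vertical gradients (90 degrees), so bin 4 of 9.
horizontal_edge = [[0] * 6 for _ in range(3)] + [[255] * 6 for _ in range(3)]
horizontal_hist = hog_cell_histogram(horizontal_edge)
```

In a full pipeline the image is split into many cells, the per-cell histograms are concatenated into one feature vector, and that vector is what the SVM is trained on.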

Even then, we've had to write extra heuristics to disambiguate between pairs such as 2 and z, or s and 5. That was too much work and a lot of if-else. We're currently putting our efforts into CNNs (Convolutional Neural Networks). As a start, you can look at Torch or Caffe.
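
To give a flavor of the kind of heuristic involved, here is a hypothetical sketch, not our actual rules: look at the rest of the token, and if it is mostly digits, read an ambiguous letter as a digit, and vice versa. The lookup tables and the name `disambiguate` are made up for illustration and cover only the two pairs mentioned above.

```python
# Hypothetical confusion tables covering only the 2/z and s/5 pairs.
TO_DIGIT = {"z": "2", "Z": "2", "s": "5", "S": "5"}
TO_LETTER = {"2": "z", "5": "s"}


def disambiguate(token):
    """Resolve look-alike characters using the majority character class."""
    digits = sum(c.isdigit() for c in token)
    letters = sum(c.isalpha() for c in token)
    if digits > letters:
        table = TO_DIGIT    # mostly digits: treat stray letters as digits
    elif letters > digits:
        table = TO_LETTER   # mostly letters: treat stray digits as letters
    else:
        return token        # no majority, leave the token untouched
    return "".join(table.get(c, c) for c in token)


year = disambiguate("19z4")
word = disambiguate("pri5m")
```

Rules like this break down quickly on mixed tokens such as part numbers, which is where the pile of special cases starts to grow.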

1 comment

In my experience, CNNs offer the best performance. They are also easy to treat as a black box with many tunable parameters.

But the existing frameworks are mostly poor as engineering products. Caffe, for example, calls `exit` every time an error occurs, including recoverable ones such as a file that does not exist.
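
To illustrate why that matters, here is a small Python analogy. It is not Caffe's code, and every function name in it is hypothetical. The first loader terminates the process on any I/O failure; the second lets the exception propagate, so a caller can recover, for example by trying the next candidate path.

```python
import sys


def load_model_exiting(path):
    """The style criticized above: any failure terminates the whole process."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as err:
        sys.exit(f"fatal: {err}")


def load_model(path):
    """Alternative: let the exception propagate so the caller decides."""
    with open(path, "rb") as f:
        return f.read()


def load_first_available(paths):
    """Recover from missing files by falling back to the next candidate."""
    for path in paths:
        try:
            return path, load_model(path)
        except FileNotFoundError:
            continue  # recoverable: try the next path
    raise FileNotFoundError(f"none of the candidate files exist: {paths}")
```

With the exiting version, a fallback loop like `load_first_available` is only possible if every caller remembers to trap the exit, which is easy to get wrong when the library is embedded in a larger application.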

Yes, we found Caffe overwhelming too. This library seems promising, since it has no dependencies and is quite portable to Android or iOS: https://github.com/nyanp/tiny-cnn