by Livven 3099 days ago
If you're trying to handle text "in the wild" and not scanned documents, the keyword is "scene text". Most papers are focused on either detection/localization, i.e. finding the location of text, or recognition, i.e. recognizing the actual content given a cropped text image.
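
To make the detection/recognition split concrete, here is a minimal sketch of how the two stages fit together. Everything in it is an assumption for illustration: `read_scene_text`, `toy_detect`, and `toy_recognize` are hypothetical names, the toy functions are stand-ins for trained models, and the boxes are axis-aligned, whereas several of the detection papers below predict rotated boxes.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the axis-aligned region described by box out of image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def read_scene_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds text regions; stage 2 reads each cropped region."""
    results = []
    for box in detect(image):
        results.append((box, recognize(crop(image, box))))
    return results


# Hypothetical stand-ins so the sketch runs; a real system would plug in
# a trained detector and a trained recognizer here.
def toy_detect(image: Image) -> List[Box]:
    """Return one box around all non-zero pixels, or no boxes if there are none."""
    ys = [y for y, row in enumerate(image) if any(row)]
    xs = [x for row in image for x, value in enumerate(row) if value]
    if not ys:
        return []
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_recognize(patch: Image) -> str:
    """Pretend to read text: just report the patch size."""
    return f"{len(patch[0])}x{len(patch)} patch"
```

Calling `read_scene_text(image, toy_detect, toy_recognize)` returns a list of `(box, text)` pairs. The point of the split is that the recognizer only ever sees cropped patches, so either stage can be swapped out independently.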

Here are some current state-of-the-art papers on detection, with code where available:

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection https://arxiv.org/abs/1709.03272

EAST: An Efficient and Accurate Scene Text Detector https://arxiv.org/abs/1704.03155 https://github.com/argman/EAST

Detecting Oriented Text in Natural Images by Linking Segments https://arxiv.org/abs/1703.06520 https://github.com/dengdan/seglink

Arbitrary-Oriented Scene Text Detection via Rotation Proposals https://arxiv.org/abs/1703.01086 https://github.com/mjq11302010044/RRPN

And for recognition:

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition https://arxiv.org/abs/1507.05717 https://github.com/bgshih/crnn

Robust Scene Text Recognition with Automatic Rectification https://arxiv.org/abs/1603.03915


On this subject, I'd also add the following whitepaper from MS - http://digital.cs.usu.edu/~vkulyukin/vkweb/teaching/cs7900/P...
Note that this paper is from 2010 and thus, while quite influential for its time, is far from the current state of the art. The stroke width transform method that it introduced is simply not as good as current deep learning-based methods.

If you want an overview, see this survey from 2016 (slightly out of date, but what can you do, the field is moving very fast):

Scene Text Detection and Recognition: Recent Advances and Future Trends http://mclab.eic.hust.edu.cn/UpLoadFiles/Papers/FCS_TextSurv...