by shoshin23
3111 days ago
Image text recognition is a major problem we're trying to solve at our startup. I would love to be pointed to some SOTA research in this space; it's hard to find anything by Googling. In our experience, Cloud Vision API is a killer option compared to both AWS and MSFT, though it's pricier and slower than AWS. MSFT is terrible on both price and speed.

Here are some current state-of-the-art papers on detection, with code where available:
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection https://arxiv.org/abs/1709.03272
EAST: An Efficient and Accurate Scene Text Detector https://arxiv.org/abs/1704.03155 https://github.com/argman/EAST
Detecting Oriented Text in Natural Images by Linking Segments https://arxiv.org/abs/1703.06520 https://github.com/dengdan/seglink
Arbitrary-Oriented Scene Text Detection via Rotation Proposals https://arxiv.org/abs/1703.01086 https://github.com/mjq11302010044/RRPN
And for recognition:
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition https://arxiv.org/abs/1507.05717 https://github.com/bgshih/crnn
Robust Scene Text Recognition with Automatic Rectification https://arxiv.org/abs/1603.03915