by visarga
2374 days ago
It's probably a detection neural net (such as Faster R-CNN) that puts bounding boxes around words, complicated by the fact that it can predict polygons in any orientation, followed by an LSTM-CRF layer for text transcription. It's a good generalist OCR but often gives sub-par results on specific types of input. It tends to miss single letters surrounded by whitespace.
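
To illustrate why arbitrary orientation complicates things: before a rotated word polygon can go to the transcription stage, it has to be mapped back into an upright frame. Below is a minimal sketch of that geometry in plain Python. The function names `rectify_quad` and `to_upright` are hypothetical, and the corner ordering is an assumption of this sketch; none of it is taken from the actual system, whose internals I'm only guessing at.

```python
import math

def rectify_quad(quad):
    """Return (angle_degrees, width, height) for a word quadrilateral.

    Assumption: quad holds four (x, y) corners ordered top-left, top-right,
    bottom-right, bottom-left in the text's own reading frame, using image
    coordinates (y grows downward).
    """
    (x0, y0), (x1, y1), _, (x3, y3) = quad
    # Angle of the top edge (the reading direction) relative to the x axis.
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    width = math.hypot(x1 - x0, y1 - y0)   # length of the top edge
    height = math.hypot(x3 - x0, y3 - y0)  # length of the left edge
    return angle, width, height

def to_upright(point, quad):
    """Map an image point into the upright crop frame anchored at quad's top-left."""
    angle, _, _ = rectify_quad(quad)
    theta = math.radians(-angle)  # rotate by the opposite angle to undo the tilt
    dx = point[0] - quad[0][0]
    dy = point[1] - quad[0][1]
    return (dx * math.cos(theta) - dy * math.sin(theta),
            dx * math.sin(theta) + dy * math.cos(theta))

# A word rotated by 90 degrees: its reading direction runs down the image.
quad = [(0, 0), (0, 10), (-4, 10), (-4, 0)]
angle, width, height = rectify_quad(quad)
top_right = to_upright(quad[1], quad)
bottom_left = to_upright(quad[3], quad)
```

In the upright frame the top-right corner lands at roughly `(width, 0)` and the bottom-left at roughly `(0, height)`, so the crop can be handed to a recognizer that expects horizontal text. A real pipeline would resample pixels as well (for example with a perspective warp), which this sketch leaves out.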