by visarga 2374 days ago
It's probably a detection neural net (such as Faster R-CNN) for putting bounding boxes around words, complicated by the fact that it can predict polygons in any orientation, followed by an LSTM-CRF layer for text transcription. It's a good generalist OCR but often has sub-par results for specific types of input. It tends to miss single letters surrounded by whitespace.
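
To give a rough idea of why arbitrary orientation matters for the detection step, here is a small sketch comparing a rotated word polygon with the axis-aligned box that encloses it. This is only plain geometry, not the actual model; the helper names and the 100x20 word size are made up for illustration.

```python
import math

def rotated_rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by angle_deg."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        corners.append((cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a))
    return corners

def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def axis_aligned_box(points):
    """Smallest upright box (x0, y0, x1, y1) containing all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

def box_area(box):
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)

# Hypothetical word: 100 px wide, 20 px tall, rotated 45 degrees.
word = rotated_rect_corners(0, 0, 100, 20, 45)
tight = polygon_area(word)
loose = box_area(axis_aligned_box(word))
print(f"polygon area: {tight:.0f}, upright box area: {loose:.0f}")
```

For this example the upright box covers about 3.6 times the area of the word itself, so a detector limited to upright boxes would hand the transcription stage a crop that is mostly background or neighbouring text.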