Hacker News
by kamalfariz 2219 days ago
OCR techniques are general purpose: they try to map any text-like shape to actual text. Accuracy can vary wildly, but the good ones match against plausible words to eliminate low-quality guesses.

Is there an accuracy optimization to be found if I can pre-train the OCR engine to look for a limited set of words instead of the entire dictionary and the full printable character space?

The use case I have is OCRing shipping labels for packages that arrive at an office. The set of plausible matches is incredibly small: it is the set of names of the employees who work in that office.
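To make the restricted-vocabulary idea concrete, here is a minimal sketch that post-processes whatever text a generic OCR engine returns by snapping it to the closest name on a roster. It uses only Python's standard `difflib`; the roster, the `best_match` helper, and the 0.75 threshold are all hypothetical placeholders, not something from an actual OCR library.

```python
from difflib import SequenceMatcher

# Hypothetical roster; in practice this would come from the office directory.
EMPLOYEES = ["Alice Johnson", "Bob Martinez", "Carol Nguyen", "David O'Brien"]

def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())

def best_match(ocr_text, names=EMPLOYEES, threshold=0.75):
    """Return (name, score) for the closest roster entry.

    If no entry reaches the (assumed, untuned) threshold, return (None, score)
    so the caller can fall back to manual handling.
    """
    query = normalize(ocr_text)
    scored = [(SequenceMatcher(None, query, normalize(n)).ratio(), n) for n in names]
    score, name = max(scored)
    return (name, score) if score >= threshold else (None, score)

if __name__ == "__main__":
    print(best_match("A1ice J0hnson"))  # noisy read of a roster name
    print(best_match("FedEx Ground"))   # label text that is not a name
```

This does not change the recognizer itself; it only shrinks the output space after the fact, which may already recover most single-character misreads when the roster is small.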

Further optimizations include reducing the problem space by only considering computer-printed glyphs and not bothering with handwritten labels, and the insight that the distribution of packages follows a power law: a disproportionately small group of people receives most of the packages.
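One way the power-law observation could be exploited is as a prior: when two names look equally similar to the OCR output, prefer the person who receives more packages. The sketch below combines a string-similarity score with a smoothed delivery-count prior in log space. The delivery counts, the `rank` helper, and the `sharpness` weight are assumptions for illustration and would need tuning on real labels.

```python
import math
from difflib import SequenceMatcher

# Hypothetical delivery history: packages received per employee so far.
DELIVERY_COUNTS = {"Jan Smith": 40, "Jon Smith": 2, "Priya Patel": 5}

def rank(ocr_text, counts=DELIVERY_COUNTS, sharpness=30.0):
    """Rank names by log(prior) + sharpness * log(similarity).

    The prior is the add-one-smoothed share of past deliveries. `sharpness`
    (an assumed value) controls how strongly visual evidence outweighs the
    prior, so that a clean read of a rare name is not overridden.
    """
    total = sum(counts.values()) + len(counts)
    scored = []
    for name, n in counts.items():
        prior = (n + 1) / total
        sim = SequenceMatcher(None, ocr_text.lower(), name.lower()).ratio()
        score = math.log(prior) + sharpness * math.log(max(sim, 1e-9))
        scored.append((score, name))
    return [name for _, name in sorted(scored, reverse=True)]

if __name__ == "__main__":
    print(rank("J?n Smith"))  # ambiguous read: the prior breaks the tie
    print(rank("Jon Smith"))  # clean read: the evidence outweighs the prior
```

With an ambiguous read such as "J?n Smith", both Smiths are equally similar, so the frequent recipient ranks first; with a clean read the exact match still wins despite its low prior.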

The end goal is to perform this entirely on device, with low latency and high accuracy.

1 comment

Consider looking into language models such as KenLM. It is used by ASR models like wav2letter and DeepSpeech to correct speech-to-text transcripts.
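As a rough illustration of what language-model rescoring means here, the toy below trains a character-bigram model on the roster and picks the OCR candidate it finds most probable. This is not KenLM and does not use its API: KenLM builds n-gram models with modified Kneser-Ney smoothing, while this stand-in uses simple additive smoothing, and the class name, roster, and `alpha` value are all hypothetical.

```python
import math
from collections import Counter

class CharBigramLM:
    """Toy character-bigram model with additive smoothing (a stand-in, not KenLM)."""

    def __init__(self, corpus, alpha=0.1):
        self.alpha = alpha
        self.bigrams = Counter()   # (previous char, current char) -> count
        self.contexts = Counter()  # previous char -> count
        self.vocab = set()
        for line in corpus:
            padded = "^" + line.lower() + "$"  # ^ and $ mark start and end
            self.vocab.update(padded[1:])
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1

    def log_prob(self, text):
        """Log-probability of `text`; unseen transitions get smoothed mass."""
        padded = "^" + text.lower() + "$"
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigrams[prev, cur] + self.alpha
            den = self.contexts[prev] + self.alpha * v
            total += math.log(num / den)
        return total

def rescore(candidates, lm):
    """Pick the candidate transcription the model considers most likely."""
    return max(candidates, key=lm.log_prob)

if __name__ == "__main__":
    lm = CharBigramLM(["alice johnson", "bob martinez", "carol nguyen"])
    print(rescore(["a1ice johnson", "alice johnson", "alicc johnson"], lm))
```

In a real pipeline the candidates would be the OCR engine's alternative hypotheses, and the model score would typically be combined with the engine's own confidence rather than used alone.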