Hacker News
by fmbb 760 days ago
OCR has been a solved problem for years, long before LLMs started being hyped.

At least for typewritten documents that you did not torch, shred, etc.

No it hasn't. Just 1.5 years ago I tried all the latest OCR tools, including AWS, GCP and Azure services, and none of them could consistently and reliably read a receipt printed at a store.
Receipts are hard.

- cheap paper
- cheap ink
- misprints
- abbreviations
- every store does it differently

Yes. Which makes OCR not a solved problem.
OCR is merely step one.

Interpreting recognized characters is another matter.
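
To make the "step one" point concrete, here is a rough sketch of what comes after character recognition: turning already-recognized receipt lines into items and a total. Everything in it is an assumption for illustration: the sample lines, the `parse_receipt` helper, and the one-item-per-line layout are invented, and real receipts vary far more than this.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for a store receipt; text and layout are invented.
OCR_LINES = [
    "MLK 2% GAL      3.49",
    "BNNS ORG LB     1,29",  # decimal point misread as a comma
    "THANK YOU",
    "TOTAL           4.78",
]

# An item label, whitespace, then a price whose separator may be '.' or ','.
LINE_RE = re.compile(r"^(?P<label>.+?)\s+(?P<price>\d+[.,]\d{2})$")


def parse_receipt(lines):
    """Return (items, total) parsed from already-recognized receipt lines."""
    items, total = [], None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # no trailing price, so not an item or total line
        price = Decimal(match["price"].replace(",", "."))
        label = match["label"].strip()
        if label.upper() == "TOTAL":
            total = price
        else:
            items.append((label, price))
    return items, total


items, total = parse_receipt(OCR_LINES)
# Cross-check: do the parsed item prices add up to the printed total?
consistent = total is not None and sum(price for _, price in items) == total
```

Even with every character read correctly, the code still has no idea that "MLK" means milk or "BNNS" means bananas, and the regex breaks as soon as a store prints quantities, discounts, or tax lines in a different layout.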