by disattention 1027 days ago
Why not use existing OCR/document extraction tools [0]? There are a number of options, and even a custom implementation is probably a reasonable side project given some standardized structure.

[0]: https://rossum.ai/lp/data-extraction

1 comment

The structure isn't standardized: it's a random check design placed on top of an invoice, which may be printed on any of a wide variety of printers at some random scale and possibly photocopied multiple times.