by exhibitapp
1152 days ago
I've worked extensively in this space. For those looking for just an OCR solution, MSFT's "Read" offering is by far the most accurate. Key-value, table, and other information extraction is a much harder problem. Anything that can go wrong in production will: documents with extra pages, rotated, blacked out, or fuzzy. There are many steps that go into making document extraction really e2e. The biggest enterprise users are doing a thousand-plus pages a minute, which also turns document extraction into a distributed-systems scaling problem.
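To put a rough number on the scaling point, here is a back-of-envelope sketch. The 2 seconds per page and the 70% target utilization are assumptions picked for illustration, not measurements of any real service, and `workers_needed` is a hypothetical helper name.

```python
import math


def workers_needed(pages_per_minute: float,
                   seconds_per_page: float,
                   utilization: float = 0.7) -> int:
    """Estimate concurrent workers needed to keep up with a page stream.

    Little's-law style estimate: arrival rate times per-page service time,
    divided by the target utilization so there is headroom for bursts.
    All inputs are assumptions supplied by the caller.
    """
    if pages_per_minute < 0 or seconds_per_page < 0:
        raise ValueError("rates and latencies must be non-negative")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    pages_per_second = pages_per_minute / 60.0
    return math.ceil(pages_per_second * seconds_per_page / utilization)


if __name__ == "__main__":
    # 1000 pages/min at an assumed 2 s per page and 70% utilization
    print(workers_needed(1000, 2.0))
```

Even with these modest assumptions the answer is dozens of concurrent workers, before accounting for retries on rotated or fuzzy pages, which is roughly why this stops being a single-box problem.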