by nanoamp
2347 days ago
I can see the use case and the potential for ML in extracting tables, but I'd be worried about the decision-making mistakes it could cause in the environments the author identifies, such as finance. The TableNet example, which uses deep learning for table extraction on top of Tesseract for OCR, means two layers of ML, either of which could introduce pathologies on its own, with no human oversight. It reminds me of the photocopier that changed numbers for you: https://www.theregister.co.uk/2013/08/06/xerox_copier_flaw_m... If an ML engine were trained to do things like look for totals and sub-totals in numerical tables and flag errors in summation, that would clearly add more value in parsing for moderation (the use case described at the end). But that doesn't seem to be something that's yet... on the table.
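To make the summation check concrete, here is a rough sketch of the kind of thing I mean. It is plain rule-based Python, not ML and not anything TableNet or Tesseract provides. The function name `flag_sum_errors`, the row format, and the "Subtotal"/"Total" labels are all my own assumptions for illustration.

```python
from decimal import Decimal


def flag_sum_errors(rows):
    """Flag subtotal/total rows that disagree with the items above them.

    rows: list of (label, amount_string) pairs, as might come out of a
    table extractor. The labels "Subtotal" and "Total" are assumed
    conventions, not something a real extractor guarantees.
    Returns a list of (row_index, expected, found) tuples.
    """
    errors = []
    section_sum = Decimal("0")  # items since the last subtotal row
    grand_sum = Decimal("0")    # all items seen so far
    for index, (label, amount) in enumerate(rows):
        value = Decimal(amount)  # Decimal avoids float rounding on money
        kind = label.strip().lower()
        if kind == "subtotal":
            if value != section_sum:
                errors.append((index, section_sum, value))
            section_sum = Decimal("0")
        elif kind == "total":
            if value != grand_sum:
                errors.append((index, grand_sum, value))
        else:
            section_sum += value
            grand_sum += value
    return errors


# Hypothetical extracted table with two digits swapped in the final total,
# the sort of silent substitution the copier story is about.
rows = [
    ("Rent", "1200.00"),
    ("Power", "85.50"),
    ("Subtotal", "1285.50"),
    ("Travel", "640.00"),
    ("Subtotal", "640.00"),
    ("Total", "1952.50"),
]
errors = flag_sum_errors(rows)
```

On that sample only the last row is flagged: the items add up to 1925.50, while the table says 1952.50. Note the limit of the check: a mismatch says the table is inconsistent, not whether the extraction or the source document is at fault, so it can only ever be a flag for a human to look at.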
https://www.microsoft.com/en-us/research/publication/melford...