Hacker News
by nl 2103 days ago
I've done some work on deep learning approaches - specifically table extraction.

It's not at all obvious how to make this work. A lot of human judgement goes into deciding what is a header and what are values, especially with merged header cells that span several columns or rows.
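To make the merged-header problem concrete, here is a minimal sketch, not taken from any real extraction tool. The grid, the `None` convention for cells swallowed by a merge, and the `flatten_headers` helper are all assumptions for illustration. Note that `n_header_rows` has to be supplied by hand, which is exactly the judgement call described above.

```python
# Hypothetical extracted grid: None marks a cell swallowed by a merged header.
rows = [
    ["Region", "2019", None, "2020", None],   # "2019" and "2020" each span two columns
    [None,     "Q1",   "Q2", "Q1",   "Q2"],   # "Region" spans two rows
    ["North",  "10",   "12", "11",   "15"],
]


def flatten_headers(rows, n_header_rows):
    """Forward-fill merged header cells left to right, then join the levels per column.

    n_header_rows is an assumption the caller must supply; nothing in the grid
    itself says where the header ends and the values begin.
    """
    filled = []
    for row in rows[:n_header_rows]:
        last = None
        out = []
        for cell in row:
            if cell is not None:
                last = cell
            out.append(last)
        filled.append(out)

    n_cols = len(rows[0])
    columns = [
        " ".join(level[i] for level in filled if level[i] is not None)
        for i in range(n_cols)
    ]
    return columns, rows[n_header_rows:]


columns, body = flatten_headers(rows, n_header_rows=2)
print(columns)
print(body)
```

Calling it with `n_header_rows=1` instead treats the quarter row as data and yields duplicate column names, which shows how much the result depends on that one guess.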

1 comment

Yeah, I remember Abbyy also has an interface for defining layouts for this kind of problem, i.e., this thing is a table and here are the headers, etc.

Sorry, I was not trying to say deep learning would be a substitute for all such issues, just that new approaches may help a smaller team build those tools more efficiently.

I don't know if Abbyy combines its layout tool with training a model for customers, but it seems like a reasonable thing to build and expose.