by unityByFreedom 2103 days ago
Also quite expensive since it is at the head of the pack IIRC. There is probably some value in making a competitor with new deep learning techniques, provided you have a sufficiently diverse training set. It would take years to build, though.

I've done some work on deep learning approaches - specifically table extraction.

It's not at all obvious how to make this work - there is a lot of human judgement involved in deciding what is a header vs. what is a value, especially with header cells merged across columns or rows.
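
To make the merged-header issue concrete, here is a rough sketch (not from any real extraction tool). It assumes the hard part is already solved: someone has decided which rows are headers and how many columns each merged cell spans. The `(text, colspan)` representation and the function names are hypothetical, made up for illustration.

```python
# Hypothetical representation: each header row is a list of (text, colspan) cells.
# Deciding which rows are headers, and the spans, is assumed to be done already.

def expand_row(row):
    """Repeat each cell's text across the columns it spans."""
    expanded = []
    for text, colspan in row:
        expanded.extend([text] * colspan)
    return expanded


def column_paths(header_rows):
    """Combine the expanded header rows into one label path per column."""
    expanded = [expand_row(row) for row in header_rows]
    if len({len(row) for row in expanded}) != 1:
        raise ValueError("header rows cover different numbers of columns")
    # Skip empty cells so a column with a single-level header keeps a clean label.
    return [" / ".join(t for t in column if t) for column in zip(*expanded)]


header_rows = [
    [("Region", 1), ("2019", 2), ("2020", 2)],
    [("", 1), ("Q1", 1), ("Q2", 1), ("Q1", 1), ("Q2", 1)],
]
paths = column_paths(header_rows)
print(paths)
```

Flattening is the easy bit once the spans are known. The judgement call is producing `header_rows` from a scanned page in the first place, which is where a model (or a human) has to guess.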

Yeah, I remember Abbyy also has an interface to define layouts for this kind of problem, i.e., this thing is a table and here are the headers, etc.

Sorry, I was not trying to say deep learning would be a substitute for all such issues, just that new approaches may help a smaller team build those tools more efficiently.

I don't know if Abbyy combines its layout tool with training a model for customers, but it seems like a reasonable thing to build and expose.