

by jfk13 2063 days ago

Sometimes works well, depending on the structure and content of the PDF. Other times it's hopeless.

Certainly not a general solution. Indeed, there isn't one, because the design of PDF allows far too many things that can't be reliably deciphered back to the source data.

That's why Adobe is throwing all their ML at it, to try and come up with something that guesses near enough right more of the time.