by udayrddy 2304 days ago
Same here. I mostly deal with text analytics. Text PDFs don't cause many issues unless a crazy font is used, but 2-column pages are a nightmare.

In case you are looking for an API to extract structure-rich content like tables from PDFs or images, take a look at https://extracttable.com (P.S. I contributed to it).