Hacker News
by jlink 3397 days ago
During development I compared my results with those of the pdftotext utility and obtained more or less similar results. The goal of my code was to have an equivalent tool that is easy to embed in any Java/Android project, and to learn more about Apache PDFBox.
1 comment

I imagine it's not an easy task guessing about proportionally spaced fonts, overlapping bounding boxes, columns, tables, wrapping, and so forth.
Yes, definitely not easy, but fortunately PDFBox offers a solid base to start with.
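To give a rough idea of the kind of guessing involved, here is a minimal sketch of one common heuristic: deciding where to put a space between glyphs on a line by looking at the horizontal gap between their bounding boxes. This is not the code from the project and it does not use the PDFBox API; the `Glyph` class, the `joinLine` method and the `gapRatio` threshold are all hypothetical names made up for illustration, standing in for whatever position data a PDF library reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GapSpacing {

    /** Hypothetical positioned glyph: text, left edge, and advance width. */
    static final class Glyph {
        final String text;
        final double x;
        final double width;

        Glyph(String text, double x, double width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins the glyphs of one line into a string, inserting a space when the
     * gap to the previous glyph exceeds gapRatio times that glyph's width.
     * Overlapping boxes give a negative gap and never produce a space.
     */
    static String joinLine(List<Glyph> glyphs, double gapRatio) {
        // PDF content streams need not draw glyphs left to right, so sort by x.
        List<Glyph> sorted = new ArrayList<>(glyphs);
        sorted.sort(Comparator.comparingDouble((Glyph g) -> g.x));

        StringBuilder out = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : sorted) {
            if (prev != null) {
                double gap = g.x - (prev.x + prev.width);
                if (gap > gapRatio * prev.width) {
                    out.append(' ');
                }
            }
            out.append(g.text);
            prev = g;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Glyphs deliberately out of order; "i" is narrower than the others.
        List<Glyph> line = Arrays.asList(
                new Glyph("y", 12, 6),
                new Glyph("H", 0, 6),
                new Glyph("o", 18, 6),
                new Glyph("i", 6, 3));
        System.out.println(joinLine(line, 0.5)); // prints "Hi yo"
    }
}
```

Using the previous glyph's width as the reference is the fragile part with proportionally spaced fonts: a gap that is clearly a space after an "i" may be ordinary kerning after an "m". Columns, tables and wrapping need further grouping on top of this.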