by topogios 3742 days ago
PDFBox is quite good at extracting text from PDFs. When I use PDFBox, it is for extraction.

Maybe this has changed with newer versions of PDFBox, but 5+ years ago, the internet wisdom was to use PDFBox for extraction and something else, like a version of iText that suited your license needs, for generation.

As much as I like LaTeX, it is not trivial to produce good-looking custom output with it if you have not already invested time in learning to typeset with it.

Have you tried using (La)TeX in a real-world project? It would be cool to hear from someone whether compilation time is an issue. Some TeX packages have quite a severe impact on performance.