by wolverine876 1546 days ago
> LaTex is one of the worst offenders when it comes to producing mangled text layers. Multi column text is often stored with both columns interleaving, or not at all. Verbatim is mangled, formulas are a hot mess. The text order between paragraphs is not preserved.

That's interesting. I must not deal with many LaTeX-based PDFs. The text in the electronically born PDFs I use is usually nearly flawless, with two exceptions: the bizarre extra space inserted between some words, and hyphenated words that stay split even though the line no longer wraps at that spot.

> All those extra features you mention make your "still readable in 50 years" requirement go out the window pretty quickly.

I don't have your expertise, but I've heard a different story from librarians regarding PDF and particularly PDF/A.