Hacker News
by Dalewyn 541 days ago
>Agreed on importance of portability and durability.

I think "importance" is understating it, because permanent consistency is practically the only reason we all (still) use PDFs in quite literally every professional environment as a lowest common denominator industrial standard.

PDFs will always render the same, whether on paper or a screen of any size connected to a computer of any configuration. PDFs will almost always open and work in Adobe Reader or in the PDF viewer that these days is simply built into Chrome.

PDFs will almost certainly Just Work(tm), and Just Working(tm) is a god damn virtue in the professional world because time is money and nobody wants to be embarrassed handing out unusable documents.

1 comment

PDFs generally will look close enough to the original intent that they will almost always be usable, but will not always render the same. If nothing else, there are seemingly endless font issues.
In this day and age, that seems increasingly like a solved problem for most end users. Aren't the remaining cases usually a client-side issue or the result of a very old method of generating the PDF?

Modern PDF supports font embedding of various kinds (legality is left as an exercise to the PDF author) and supports 14 standard font faces which can be specified for compatibility, though more often document authors probably assume a system font is available or embed one.
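To make the three cases above concrete (embedded font, one of the 14 standard faces, or a font the author just assumed would be installed), here is a rough sketch in Python. The function names and the labels it returns are made up for illustration; the only parts taken from the PDF format are the 14 standard font names and the six-letter `ABCDEF+` tag that subset-embedded fonts carry.

```python
# The 14 standard PDF font names (the "base 14").
STANDARD_14 = frozenset({
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
})


def strip_subset_prefix(name: str) -> str:
    """Drop the 'ABCDEF+' tag that subset-embedded fonts carry, if present."""
    prefix, sep, rest = name.partition("+")
    if sep and len(prefix) == 6 and prefix.isalpha() and prefix.isupper():
        return rest
    return name


def font_risk(base_font: str, embedded: bool) -> str:
    """Classify a font reference (hypothetical labels, not from any spec).

    base_font: the font's name as written in the PDF, e.g. '/Helvetica'.
    embedded:  whether the PDF carries the font program itself.
    """
    name = strip_subset_prefix(base_font.lstrip("/"))
    if embedded:
        return "embedded"           # glyph outlines travel with the file
    if name in STANDARD_14:
        return "standard-14"        # viewer is expected to supply a match
    return "depends-on-viewer"      # whatever the reader's system substitutes


if __name__ == "__main__":
    for font, emb in [("/Helvetica", False),
                      ("/ABCDEF+Calibri", True),
                      ("/Calibri", False)]:
        print(font, "->", font_risk(font, emb))
```

The last case is where the "seemingly endless font issues" tend to come from: the file names a font but does not carry it.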

There are still problems with the format, as it focuses first and foremost on document display rather than document structure or intent, and accessibility support in documents is rare to non-existent outside of government use cases or maybe Word and the like.

A lot of usability improvements come from clients that make an attempt to parse the PDF to make the format appear smarter. macOS Preview can figure out where columns begin and end for natural text selection, and Acrobat routinely generates an accessible version of a document after opening it, including some table detection. Honestly, creative interpretation of PDF documents is possibly one of the best use cases of AI that I’ve ever heard of.

While a lot about PDF has changed over the years, the basic standard was created to optimize for printing. It’s as if we started with GIF and added support for building interactive websites out of GIFs. At its core, a PDF is just a representation of shapes on a page. We added metadata that would hopefully identify glyphs, accessible alternative content, and smarter text/line selection, but it can fall apart if the PDF author is careless, malicious, or didn’t expect certain content. It probably inherits all the weirdness of Unicode and then some, for example.
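A toy illustration of the "shapes on a page" point, in Python. The content stream below is hand-written and heavily simplified (real streams are usually compressed and use many more operators), and the regex, function names, and `gutter_x` threshold are all assumptions for the sketch. `Tm` sets the text position and `Tj` paints a string; nothing in the stream says which strings belong to the same column.

```python
import re

# Hypothetical two-column page: the producer painted it row by row.
CONTENT_STREAM = """
BT
/F1 12 Tf
1 0 0 1 72 720 Tm (Left column, line 1) Tj
1 0 0 1 320 720 Tm (Right column, line 1) Tj
1 0 0 1 72 704 Tm (Left column, line 2) Tj
1 0 0 1 320 704 Tm (Right column, line 2) Tj
ET
"""

# Only matches the simple "translate, then show string" pattern used above.
SHOW_TEXT = re.compile(r"1 0 0 1 (\d+) (\d+) Tm \(([^()]*)\) Tj")


def painted_strings(stream):
    """Return (x, y, text) tuples in the order the stream paints them."""
    return [(int(x), int(y), text) for x, y, text in SHOW_TEXT.findall(stream)]


def naive_text(stream):
    """Text in paint order, which is all the file itself guarantees."""
    return [text for _, _, text in painted_strings(stream)]


def column_aware_text(stream, gutter_x=200):
    """Guess reading order: left column top to bottom, then right column.

    gutter_x is an assumed x position separating the columns; a real
    viewer has to infer it from the layout.
    """
    items = painted_strings(stream)
    left = sorted((i for i in items if i[0] < gutter_x), key=lambda i: -i[1])
    right = sorted((i for i in items if i[0] >= gutter_x), key=lambda i: -i[1])
    return [text for _, _, text in left + right]


if __name__ == "__main__":
    print("\n".join(naive_text(CONTENT_STREAM)))
    print("---")
    print("\n".join(column_aware_text(CONTENT_STREAM)))
```

The naive version interleaves the two columns because that is the order the shapes were drawn in. The column-aware version is the kind of guesswork a viewer has to layer on top, and it only works here because the gutter position was supplied by hand.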

I would assume these decision tree PDFs use a commonly available font. Layout and interpreted outcomes should be the same.