


	by jeffreportmill1 1582 days ago
	Off topic, but man is that document hard to use as a reference. Ironically, I wish they would publish it as HTML broken down by chapter and section. (I have used that document a lot to write a custom PDF generator and parser in Java, using a downloaded copy)

3 comments

fivea 1582 days ago

> Ironically, I wish they would publish it as HTML broken down by chapter and section.

I wish there was an EPUB version of the document. Do PDFs support reflowable content?


HWR_14 1582 days ago

I believe one of the selling points of PDFs was the absolute lack of reflowing content.


hunter2_ 1582 days ago

Right, as the point is to represent a physical document, paper and ink (or canvas, toner, whatever -- stuff that doesn't reflow).

Why anyone would use such a format for these situations, where the audience definitely cares way more about consuming it on an electronic device than printing it out, is... mind-boggling.

Of course, AI+ML to the rescue: Liquid Mode [0].

> Files are processed in our secure data servers and immediately deleted from our servers after the experience is generated.

[0] https://www.adobe.com/devnet-docs/acrobat/android/en/lmode.h...


HWR_14 1582 days ago

I've found that a fixed layout, where the author has been precise about how equations and text are intermixed, can be easier to read than reflowing content. Other than that, not so much.

Edit: Non-reflowing content also works well if you need to refer people to page numbers and paragraphs.

I look forward to playing with Liquid Mode at some point soon.


hunter2_ 1581 days ago

CSS flow control and specifying an `id` attribute value as a URL fragment would be my solutions to those particular concerns, if it weren't the case that our context here is capturing from software that offers printing but doesn't offer exporting to HTML very well. I think the solution might be "bring it to a good web dev and have a solid punch list."


compressedgas 1582 days ago

A PDF can be reflowed without reconstructive processing only if it was generated as a Tagged PDF [1] and the viewer supports reflowing.

[1]: Essentially a PDF with its own EPUB inside it, but unlike just having an attached EPUB, there is a map between the page layout of the PDF and the tags.

There are implementations of reconstructive reflowing that infer the layout block structure and reading order, and can reflow a two-column paper into a single column.
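To make the reconstructive idea concrete, here is a minimal sketch in Python. It assumes the text blocks and their positions have already been extracted from the page; the `Block` type, the `reading_order` and `reflow` helpers, and the "split at the page midpoint" heuristic are all hypothetical simplifications for illustration, not how any particular viewer does it. Real implementations have to cope with headings that span columns, figures, footnotes, and so on.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block with the position of its top-left corner (hypothetical type)."""
    text: str
    x: float  # left edge of the block
    y: float  # top edge, measured downward from the top of the page


def reading_order(blocks, page_width):
    """Guess reading order for a two-column page.

    Assumption: a block belongs to the left column if its left edge is
    left of the page midpoint. Each column is read top to bottom, left
    column first.
    """
    mid = page_width / 2
    left = sorted((b for b in blocks if b.x < mid), key=lambda b: b.y)
    right = sorted((b for b in blocks if b.x >= mid), key=lambda b: b.y)
    return left + right


def reflow(blocks, page_width):
    """Join the blocks into one single-column string in the inferred order."""
    return "\n\n".join(b.text for b in reading_order(blocks, page_width))


# Blocks listed in arbitrary order, as an extractor might emit them.
page = [
    Block("Right column, first paragraph.", x=320, y=50),
    Block("Left column, first paragraph.", x=40, y=50),
    Block("Left column, second paragraph.", x=40, y=200),
    Block("Right column, second paragraph.", x=320, y=200),
]

print(reflow(page, page_width=600))
```

With a Tagged PDF none of this guessing would be needed, since the reading order comes from the tags rather than from geometry.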


zozbot234 1582 days ago

PDFs can support tables of contents with labeled chapters and sections. Not sure if the feature is standardized, but it's there.


steerablesafe 1582 days ago

The specification does have a hierarchical outline, and you can click on cross references too. Of course, navigation can still be cumbersome, and linking to chapters can also be awkward (tip: right-clicking an outline element and copying the link works in Firefox).

There are some problems with the spec though, and navigation is not the most pressing one. The spec is huge, and support for less-used parts is spotty in various PDF readers. It also has inaccuracies (not corrected in errata) and underspecified parts.
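On linking into a PDF: one common approach, besides copying a link from the outline, is the `#page=N` URL fragment from Adobe's PDF open parameters, which browser PDF viewers generally honor (support in other viewers varies). A small Python sketch, where `pdf_page_link` and the example URL are hypothetical names for illustration:

```python
def pdf_page_link(url, page):
    """Return `url` with a `#page=N` fragment pointing at a 1-based page number.

    Any fragment already present on the URL is replaced.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    base = url.split("#", 1)[0]  # drop an existing fragment, if any
    return f"{base}#page={page}"


# Hypothetical URL, used only for illustration.
print(pdf_page_link("https://example.com/spec.pdf", 42))
```

This only gets you to a page, not to a section, so it complements the outline links rather than replacing them.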


layer8 1582 days ago

> hard to use as a reference

How so? I frequently reference specific sections, tables or pages of the spec at work.
