Hacker News
by Onavo 616 days ago
The key difference between print systems and web tech is responsiveness. Anything print-related is designed primarily with the dead-tree format in mind, so the layout won't change, and you don't have to worry about text reflowing after editing.

It's also why LaTeX/PDF to HTML converters are so difficult to build, because the underlying engine has no semantic information about the structure (this may be changing with LLMs and multimodal setups).

3 comments

One could ask if responsiveness is relevant for documents.

You could simply use a static layout for your html, and then add borders or zoom (just like in a pdf viewer).

Then you'd have the editability, accessibility and performance of html, with the same responsiveness as a pdf (none).

I've never really given this much thought, but HTML could really become the standard file format for documents.

> The key difference between print systems and web tech is responsiveness.

True, but... we were very good at building unresponsive websites in the early 2000s. Can't we just return to tradition and disable a lot of the responsive behaviour that we've layered onto HTML with an off-the-shelf stylesheet? Hardcode some width properties, ya know? (This is not a rhetorical question, genuinely curious).

You can trivially define a CSS stylesheet that, e.g., hides all the interactive elements like INPUTs and FORMs, or renders <A> tags like plain text.
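
For what it's worth, here is a rough sketch of what such a stylesheet could look like, wrapped in a few lines of Python so it can be dropped around any HTML fragment. The 210mm width, the selector list and the `static_page` helper are all my own assumptions for illustration, not anything standard. Note that `pointer-events: none` only stops mouse clicks, so links are still reachable from the keyboard.

```python
from html import escape

# Hypothetical print-style stylesheet: fixed page width, interactive elements hidden.
STATIC_CSS = """
body { width: 210mm; margin: 0 auto; }  /* fixed A4-like width, text does not reflow */
form, input, button, select, textarea { display: none; }
a { color: inherit; text-decoration: none; pointer-events: none; }
"""


def static_page(title: str, body_html: str) -> str:
    """Wrap an HTML fragment in a page that uses the fixed-layout stylesheet."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title>\n'
        f"<style>{STATIC_CSS}</style></head>\n"
        f"<body>{body_html}</body></html>\n"
    )


# Example fragment; the URL is a placeholder.
page = static_page("Demo", '<p>See <a href="https://example.com">this link</a>.</p>')
```

Writing `page` to a file and opening it in a browser gives a fixed-width column that you zoom like a PDF. It does nothing about the missing typesetting features, of course.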

But "H" in "HTML" is for "Hyper(text)", which really talks about the interactivity. And then you get a really bad language for typesetting that simply lacks a gazillion features of true typesetting systems like TeX or even Typst.

Then you might as well just use PDF.js and render the PDF in its entirety.

Regarding LaTeX to HTML, I have had some success with pandoc, e.g. https://ykonstant1.github.io/power-draft.html

It is much trickier if you are using TikZ heavily, but it is still doable.
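
In case it helps anyone, a minimal sketch of the kind of pandoc call this involves, driven from Python. The flags below (`--standalone`, `--mathjax`) are one plausible choice and not necessarily what was used for the page linked above. The file names and both helper functions are made up for the example.

```python
import shutil
import subprocess


def pandoc_command(src: str, dst: str) -> list[str]:
    """Build a pandoc command line for LaTeX -> standalone HTML with MathJax."""
    return [
        "pandoc", src,
        "--from=latex",   # read LaTeX source
        "--to=html5",     # write HTML5
        "--standalone",   # emit a full document, not just a fragment
        "--mathjax",      # leave math to MathJax in the browser
        "-o", dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run pandoc, failing loudly if it is missing or the conversion errors out."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(src, dst), check=True)
```

Usage would be something like `convert("draft.tex", "draft.html")`. TikZ figures will not survive this as-is, so those need separate handling.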