Hacker News
by jxf 976 days ago
> There are two main mediums for digital documents: PDFs and web pages. PDFs were designed to mirror physical documents, so they impose the real-world constraints of paper: page breaks, fixed width, and immutable styling. Web pages, by contrast, provide an essential dynamism. Web pages are undeniably the future of digital documents.

I actually don't agree with this. I think _not_ having "essential dynamism" where it's not needed is actually a feature, not a bug.

15 comments

> I actually don't agree with this. I think _not_ having "essential dynamism" where it's not needed is actually a feature, not a bug.

I couldn't agree more. Dynamism is great for web _apps_, but it's the last thing I want in a document.

OK but then what is the thing in the middle, between documents and applications?

There are fantastic, beautiful interactive experiences (for lack of a better word) that are obviously not documents (they can't be represented on paper, there is code running) but they're not really applications, either (they are fully offline, self-contained, state that's only evident on the page).

But they are, universally: dynamic.

Examples:
- https://ciechanow.ski/mechanical-watch/
- http://worrydream.com/TenBrighterIdeas/
- even: https://neal.fun/space-elevator/

This is what I think the future of textbooks and presentations should be. But I think part of the problem is that not only do we not have tools geared toward creating them, we don't even have a name for these things. If we say "document", they flunk the pdf test. If we say "web application", they are lumped into the same lumbering category as office docs and enterprise software.

Maybe Nota is a step in the right direction. But it'd be an even better step if it didn't call itself "21st century documents", if for no other reason than to defend against the valid criticism you levied against it.

When I was a kid, we called it "multimedia". Encarta being the example that comes to mind. My school library had some other interactive encyclopedia that came on a 6–CD-ROM magazine changer (the Pioneer style, if you remember music CD changers). I remember being absolutely flabbergasted at the sheer amount of megabytes sitting in that unassuming disc magazine.

That term is really quaint these days, and it doesn't fully capture what you're talking about. IIRC, it was more prepackaged animations, photos, video, and music, rather than "dynamic code running on paper".

"Dynagraph"? Sorry, that's lame, but there's my entry.

Interactive media?
Some of these interactive simulations for learning are called “Explorable Explanations” [1], coined in 2011 by Bret Victor, the author of your second link (worrydream.com), which also talks about “Explorable Explanations”.

For more examples, see [2].

Very much agree that this is what explanations and presentations should be in the modern age. I think a documentation language (what Nota and Typst aim to be) is still needed in this age of Large Language Models, when the ideas are more complex than those expressible by natural languages.

[1]: https://en.wikipedia.org/wiki/Explorable_explanation

[2]: https://explorabl.es/

Ha, I was totally thinking about the watch blog too!

I think that one is rather a special case. You could print it out with the first still of each bit and probably get 90% of the content/context, and modify the document to print out a few stills from each interactive part and get the same intellectual content. In its case, the interactivity is superfluous; it's there to spark joy. It's really an example of a document with outstanding figures/visual aids - the interactivity is just a bonus.

Similarly, the space elevator could be a picture book/pdf. The interactive bits there also spark joy.

The tenbrighterideas page mostly annoyed me. It's just bastardizing structure to be "interactive", which is to say most of the information is hidden away behind a bunch of clicks and it could have been a document, with one page dedicated to each idea.

I strongly disagree with the idea that the interactive elements are only there to "spark joy" - in both of the cases you mentioned, the interactive elements are pretty fundamental. Their purpose is to let you get somewhat hands-on with the concepts the text is discussing - to allow you to take apart the watch yourself, at your own pace, and understand what all the pieces are doing.

I'm sure you can convey the same information in text format (like you say, you could just print out the page), but these particular sites would be a lot weaker, because part of their explanatory power is the interactivity.

The original quote was that dynamism was "the last thing I want in a document", and I think these interactive diagrams and explainers directly show how useful dynamism can be in conveying information.

That's not to say that all dynamism is good - I don't usually want you to use JavaScript to just load a new page, my browser can do that just fine - but every medium can be abused. That doesn't mean that medium is bad!

I'm not saying they're only to spark joy, but to me at least, their contribution is mostly in that category. I didn't really find it central to the content in any of the examples.

As for my point that it's "the last thing I want in a document", I stand by it. What I mean is I should be able to print it and really lose nothing central to the content. Yes, digital offers features which may enrich the experience/use/navigation, but at the same time there are questions of accessibility and ease of parsing. If I _have_ to interact with things to get the information/content out, it's effectively a web app, and not a document, and if done wrong it actively interferes with my ability to absorb the information. IMO the brighter ideas page firmly falls under that last point.

The watch page is wonderful, and the visuals and interactivity are done masterfully, in such a way that they're unobtrusive and not _required_ to understand the document.

I'd classify the elevator page as a web app, but there's really nothing keeping it from being a document/children's book.

And I just really, really, did not like the brighter ideas page. I think the content is good, but the execution got in the way instead of adding to the experience.

I guess what I don't quite understand in your comment is why these two categories of "web app" and "document" are so fundamental, in the sense that we can divide all web content into either document or web app. Is this a useful categorisation, or is it just applying concepts from existing forms of media to a medium where those concepts don't really fit that well?

For example, with the watch page, if we're defining document by printability, it makes a poor document - while you can print it out, what you'll end up with is a document with lots of static pictures and a bunch of (now useless) text referencing how you can move the pictures to see different things. If I wanted a fully printable document, I'd find a different one that was written with the expectation of being printed - maybe a book about watches, or an entirely static page. That will suit the print medium significantly better than this interactive page.

It makes me think a bit of a science museum, in the sense that most science museums will have a lot of text written around that explains all the concepts they want to discuss - this is how a pivot works, that's what a cow's digestive system looks like, here's a description of a space ship or whatever. And you could collect all this text and turn it into a book, and it would be an informative book that you could read and thereby learn something.

But the value of a museum is that it doesn't just have to be text. You can put a pivot into your visitors' hands; you can show food moving between different parts of a cow's digestive system in real time; you can show genuine pieces of real rockets and discuss what journeys they've been on. The medium allows you a huge amount of extra freedom, and a good museum curator will use that freedom - wisely - to produce an experience that allows visitors to get more insight than they would have if they'd just "printed out" the museum's text and read it all.

That's not to say that written text doesn't have its own advantages - you don't need to visit a book every time you want to get information from it, for example! What's key is that by tailoring the content to the medium that we're employing, we can produce a better result than by trying to apply the norms of a different medium. If we'd built our museum like a book, it would have been a bad museum.

I think a similar principle applies to the web - it comes with its own set of tools and features that differentiate it from books or print media. Some of those are fairly subtle - the ability to reference different pages and sites using hyperlinks, for example - but part of that is the interactivity. And not all sites need interactivity at all times, and the best sites use interactivity only when it adds to the experience (just like the best museums - not the ones that surround you with flashing lights and noise just to distract your attention). But the use of interactivity can elevate a simple text far beyond what a print document can do. I think the watch site is a really good example of what happens when you don't see web pages as "just" documents, and rather embrace their unique qualities.

That's why I don't think it's always helpful to make this category distinction between "document" and "web app", particularly when "document" just means "uses the norms of a different form of media", because the whole point of the web is that it is a new media form, with its own features and capabilities.

The example academic paper puts the definitions of variables into a tooltip when you click on them. That's a nice feature for academic papers.
Do you object to the interactive Vega widget? https://nota-lang.org/reference.html#def-section-1.1

Reputable scientific journals now post videos online alongside their articles. Interactivity is even better for understanding. But permanence is an issue I suppose.

I'm not saying it's all bad. Digital augmentation can be handy (love the hell out of document search), but honestly their examples aren't compelling.

I think accessibility is a major issue, where the text doesn't make as much sense without the interactive bits, and often the text itself isn't substantial enough to stand alone.

But also, it's starting to cross the line between web app and document. I can print out a pdf and I just lose peripheral QoL benefits like document search. However, if I try to print a web app I usually lose a lot of the content/context.

Edit: as far as supplemental material goes, I'm all for it. People learn differently, so video, audio, web app, whatever are all great supplemental materials, but a good document should be able to stand by itself.

There are some genuine benefits that come along with it. I mean, if I could have auto-expanding inline footnotes/references in a document, I'd be a happy camper.
I think some basic dynamism is still necessary to make reading PDFs on small screens (e.g. mobile) comfortable. On mobile, I vastly prefer reading webpages over PDFs because most reasonable webpages should be able to fit a mobile screen.
Sadly, if you have an iPhone I don't think you can easily read Nota docs currently. The article that introduces Nota [1] has only been tested on Chrome. I tried multiple browsers on iOS to no avail (likely since they all use the same underlying rendering engine).

[1] https://willcrichton.net/nota/

All you need is re-flowable text for this, right? That's generally not considered dynamic.
If you follow the comment trail back up, in the context of this conversation, static means PDF, hard constraints of physical page layouts; dynamic means HTML, digital, no hard layout constraints.
I had read all those comments. What I'm saying is that re-flowable text is just in a very different class than web pages that auto-translate the text, have animations, or run arbitrary code.
Yes, that's what dynamic means in a frontend dev context. This is a typesetting and document layout context.
No, Nota is not just typesetting and document layout. Things like dynamic code examples, auto-translation, advanced tooltips, and reader-customizable notation are intended to set it apart from predecessors. In other words, it is different precisely because it does things that would be considered dynamic to a front-end dev.
PDFs constrain information by limiting the way it can be reproduced digitally. There is no such thing as "just" copy/pasting from a PDF - weird errors abound. The text has to be extensively reformatted or run through special software.

A format is needed that encodes information visually and digitally. The digital layer doesn't have to be visible by default, just accessible when needed.

>I actually don't agree with this. I think _not_ having "essential dynamism" where it's not needed is actually a feature, not a bug.

Yeah.

To the author's surprise, Adobe's PDF spec supports JavaScript execution [1]. And interactive 3D graphics [2][3]. Not to mention audio and video [4].

And "Liquid Mode" for responsive-layout PDF documents [5].

Of course, these "features" were considered bugs by the ISO PDF/A spec (archival, i.e. future-proof), so they were all stripped out [6].

The point being: sometimes a document should be a document.

As for science papers: LaTeX is written by humans, for humans. Custom latex commands and packages allow one to write a plaintext document that is as easily read as the paper it generates.

Which is great for accessibility, among other things.

[1] https://helpx.adobe.com/acrobat/using/applying-actions-scrip...

[2] https://www.youtube.com/watch?v=PKfyFt3zT5A

[3] https://www.youtube.com/watch?v=vW5-1LVtd9U

[4] https://helpx.adobe.com/acrobat/using/rich-media.html

[5] https://www.adobe.com/acrobat/hub/what-is-adobe-liquid-mode....

[6] https://en.wikipedia.org/wiki/PDF/A

The static issue also permeates to webpages and other formats though. Although this is now just yet another competing method for documentation or creation, the restrictions caused by using TeX or LaTeX over more dynamic approaches are not insignificant.
Check out the example of the PLDI paper. There are popups for symbols inside large, dense equations. That can really help.

I think that there's interesting open space for dynamic documents, but you need some good examples of people using it with taste.

You mean this one? https://nota-lang.org/examples/infoflow-paper/standalone/

One unfortunate problem is that nobody bothered setting the measure for legibility. On my display the text block is far far too wide. Cf. https://en.wikipedia.org/wiki/Line_length (while we're talking about typography, the fonts for body copy and code are mismatched in size in a distracting way)

As far as formulas/notation is concerned, the notation used in this paper is targeted only at experts in theoretical computer science, approximately the level of advanced grad students or above, who also happen to be pretty familiar with Rust and C++. The gimmicky popups are probably not meaningfully helpful for such an audience, and in my opinion don't really make the notation any more accessible to people without the extremely steep prerequisite expertise (e.g. I don't think this paper is going to be at all accessible to the vast majority of working programmers or computer science undergraduate students).

If you really want to make the paper more accessible, it would be better to focus on reducing the reliance on formulas, reducing the amount of jargon involved, and explaining the concepts and techniques using plain English targeted at a broader audience, rather than trying to add extra colors, click targets, or popups. (A research paper may alternately want to just target experts; that can also be fine. Even for experts this paper is pretty dense though.)

They were helpful for me, an ex-grad-student who read some type theory years ago. I think anyone breaking into the topic would appreciate that.

Requiring authors to publish two versions to make it accessible, when the motivated reader just needs a little comfort, is too high a bar. Let them write densely for their primary audience (and thus pass peer review) and still give affordances for everyone else.

As for width, font, etc, a stylesheet can fix that. I'm assuming they allow stylesheets to format for each venue appropriately.

>There are popups for symbols inside large, dense equations. That can really help.

That can also really mess with screen readers and other accessibility features.

There's a reason the PDF/A spec [1] forbids scripts in any PDF documents that are expected to be available in the future.

[1] https://en.wikipedia.org/wiki/PDF/A

This is a good point. Can a similar markup mechanism help make accessibility easier? Equations are compact to the eye but can be really unpleasant when read out loud.
It is needed. I hate reading PDFs on my 24" or 32" monitor. I hate reading them on my phone. The main thing my father complained about when switching from BlackBerry to iPhone was the missing PDF reflow feature. Basically, the only screen where I find PDFs comfortable to read is the 12.9" iPad, and only if the author has the same font-size preferences as me.
That is the main flaw of the PDF file format. It is by design.

If we want documents that work well on all screens, then we use EPUB.

I don't want dynamism, I want reproducibility.
So you want static margins that don't even adjust to mobile?
> Web pages are undeniably the future of digital documents.

They used to be. But not anymore. Web browsers + JS are too bloated for any serious documentation use.

I agree with you. Another document format is the Word document (or whatever the standard is officially called), which is an editable, page-by-page document.

In theory a PDF is a static document that should display the same in any PDF reader.

Reminds me of https://lab6.com
Reading research papers on BART is really annoying right now, so I'm a fan of this project.
Yeah; I mean do knock down the page breaks, but don't go overboard with the dynamism.
And I don't agree with you.

Every time I can choose between PDF and EPUB, I choose EPUB.

If it's not needed, it's not essential.