Hacker News
by shrimpx 906 days ago
Since the article doesn't link to any example HTML article, here's a random link:

https://browse.arxiv.org/html/2312.12451v1

It's cool that it has a dark mode. Didn't see a toggle but renders in the system mode.

Overall will make arXiv a lot more accessible on mobile.


And here's the PDF of the same paper for comparison: https://arxiv.org/pdf/2312.12451.pdf
The contrast is massive. I'm much more likely to read the html version; that PDF is deeply off-putting in some hard-to-define way. Maybe it's the two columns, or the font, or the fact that the format doesn't adjust to fit different screen sizes.
This is very interesting, because for me it's just the opposite. In particular the two column layout is just more readable and approachable for me. The PDF version also allows for a presentation just as the authors intended. I guess it's good that they offer both now.
Do you work extensively with LaTeX?

Two columns is good, albeit annoying on mobile. But the font. The typeface kills me, and almost every LaTeX-generated document sports it.

Hilariously, I would probably tolerate the HTML version a lot better if it had the font from the PDF (and FWIW, the answer for me is "no: I don't work with LaTeX at all... I just read a lot of papers").
If you disable the font rule

  :root, [data-theme=light] {
    /* --text-font-family: "freight-sans-pro"; */
  }
it switches to "Noto Serif", which is way easier on the eyes.
https://github.com/neilpanchal/spinzero-jupyter-theme /fonts/{cmu-text,cmu-mono} :

> "Computer Modern" is used for body text to give it a professional/academic look

Hating on Computer Modern (ok, probably now Latin Modern) is something close to blasphemy.
Computer Modern was not designed for easy viewing on screens (think about the screens Knuth would have been using in 1977), it was designed for printing in books.
I hate Computer Modern, and I'm not even particularly fussy about typefaces.
What device and app are you using to read the document?
The authors don’t format the pdf, the editor does. Authors probably sent a double spaced word document with figures and tables on another file.
Not on arXiv (unless I'm much mistaken), which is a preprint server, not a conventional journal.

arXiv accepts various flavors of TeX, or PDFs not produced by TeX [0], and automatically produces PDFs and HTML where possible (e.g. if TeX is submitted). In the case of the example paper under discussion, the authors submitted TeX with PDF figures [1], and the PDF version of the paper was produced by arXiv. The formatting was mainly set by using REVTeX, which is a set of macros for LaTeX intended for American Physical Society journals.

[0] https://info.arxiv.org/help/submit/index.html#formats-for-te...
[1] https://arxiv.org/format/2312.12451

FWIW, I recently learned that it is also possible to produce nice PDF papers with GNU roff (groff), have a look at this example: https://github.com/SudarsonNantha/LinuxConfigs/blob/master/....
You typically send a .tar.gz of tex files (and figures, .bbl, etc.) to the journal. And then you typically upload something very similar to the arxiv (I have an arxivify Makefile target for my papers that handles some arxiv idiosyncrasies like requiring all figures to be in the same folder as the .tex file, and it also clears all the comments; sometimes you can find amusing things in source file comments for some papers).
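
For anyone curious what the comment-clearing step might look like, here is a minimal Python sketch. It is not the Makefile target described above; the function name `strip_tex_comments` is hypothetical, and the approach (drop comment-only lines, keep a bare `%` where a comment followed text so TeX's line-joining behaviour is unchanged) is just one reasonable assumption about how such a step could work.

```python
def strip_tex_comments(source: str) -> str:
    """Remove LaTeX comments while keeping escaped percent signs (\\%).

    Hypothetical helper for illustration only.
    """
    cleaned = []
    for line in source.splitlines():
        kept = []
        i = 0
        comment_found = False
        while i < len(line):
            ch = line[i]
            if ch == "\\" and i + 1 < len(line):
                # Keep a backslash and the character it escapes together,
                # so "\%" survives and "\\%" still starts a comment.
                kept.append(line[i:i + 2])
                i += 2
                continue
            if ch == "%":
                comment_found = True
                break
            kept.append(ch)
            i += 1
        text = "".join(kept)
        if comment_found and not text.strip():
            # The line held nothing but a comment: drop it entirely.
            continue
        if comment_found:
            # Keep a bare "%" so TeX still swallows the line ending.
            text += "%"
        cleaned.append(text)
    result = "\n".join(cleaned)
    if source.endswith("\n"):
        result += "\n"
    return result
```

This sketch does not know about verbatim-style environments, where a `%` is literal text, so it would need extra care for papers that include code listings.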

Some fields may use Word files, but in most of physics you would get laughed at...

It is true that most journals will typically reformat your .tex in a different way than is displayed on the arXiv.

In computer science, the usual case is that the author fully formats the paper.
Not only is this wrong about physics/astronomy, I regularly use the arxiv version because the typography is better (e.g. in the published paper an equation is split with part of the equation being at the bottom of one column, and the top of the next, whereas the equation is on one line in the arxiv version).
You are very confidently wrong.

In the arxiv you use latex and do everything yourself. There is no editor.

You are completely wrong. ArXiv doesn't work like that.
For what it's worth, two column layouts are very common in the physical sciences, or at least in physics which I'm more familiar with. I have a feeling that the reason is at least partly to save page space when using displayed math (e.g. equations that are formatted in a break between blocks of text), which use the full text width (i.e. the width of one column) to display what may be much less than half a page wide.
It makes sense - for paper. But pixels are infinite - HTML is far better for screen display, which is how people read things nowadays.

The extra column next to the one I'm reading introduces a lot of visual noise, and the content is hard enough as it is. I'm sure physicists have all gotten used to it, but it certainly trips me up.

> The extra column next to the one I'm reading introduces a lot of visual noise

Papers are generally not read start to finish in one go: there's lots of rereading and jumping back and forth between key parts, and anything that moves them further apart makes this harder.

Ah, that makes more sense. I imagined scientists just reading the whole thing start-to-finish.

I still think a flexible layout is best. If you like multi-columns and have a wide screen, why not display 12 columns next to each other?

With PDF this is not possible. With HTML the content can in principle be sliced and diced how you like it.

I need to scroll up and down a lot more with two-column layout because a single page doesn't fit on my screen in my chosen font size (which is fairly large).

But HTML is so much more flexible, and ideally people can choose how they want it, although at this point it seems that's not (yet) implemented.

I find jumping back and forth is always a pain on computer screens and ebooks by the way, and is the major reason I much prefer print for this type of thing.

Two column is the default in astronomy also.
Definitely the two columns for me. It's super annoying skimming a paper and having to scroll down and back up again in a zig-zag pattern.
I think the consuming device matters. An iPad or computer has a much wider screen. A one-column layout is too wide on them for the average person to scan lines of text quickly.

One column looks perfectly fine on a phone, though. A two-column layout looks terrible on a smartphone; the text is too tiny to read comfortably.

It would probably be even better if you could flip it left and right like an ebook instead of scrolling, to locate content faster. But the current design is good enough IMO. (Compare it to reading a pdf on a cellphone.)

To display a two-column layout you need a tall screen, not a wide one. If you display a two-column layout on a short, wide screen, you have to scroll up and down in a zigzag pattern to read one page.
Just zoom the smartphone into one column. Problem solved.
And then you will have to scroll both up and down and left and right, an even worse experience.
If you read a lot of papers in your line of work you will quickly appreciate the two columns and justification.
Only problem is jagoffs like me who need the text to be bigger. On PDFs you now get to experience a horizontal scrollbar. HTML has text reflow and I can set the line length by resizing the window. I'm willing to make a lot of sacrifices for that experience.
Admittedly, I don't read research papers. But with HTML, surely the choice between one or two columns is a checkbox away.
Which checkbox?

I cannot find anything relevant in any of the 3 browsers I use (Vivaldi, Firefox, Chrome). Would really appreciate this option.

A quick search gave some apparently unmaintained browser extensions, and that's it.

No, I'm saying there should be a checkbox. That way, you can switch between two columns formatted like LaTeX and that font they always use, and one column with Helvetica / Arial.
I wonder if perhaps it's a generational thing, I prefer the PDF because it reminds me of printed paper, which is what I used growing up.

(For reference: I am at the end of Gen X, people 3-4 years younger than me are considered Millennials).

Quite so. The font annoys me. This is one of the reasons I hate PDF and why I believe these things should be controlled by the person reading it, not the publisher.

I do not much care what font the author finds pleasant to read, but what I find pleasant to read, and this font isn't it, and neither are the colors.

Seconded. I can (will) actually just read referenced papers now instead of hesitating to either get a headache or stay uninformed.

Defaults and UX rule the world. It’s unfortunate that $subj wasn’t a thing for so long and probably scared millions of curious minds from material. It is so important.

It feels quite standard for a paper
defo concur. will read the html version when on mobile from now on.
I prefer the pdf version, mostly. I can annotate it on the side both in print and digitally with my iPad. I can also invert colors in pdf readers to get some kind of “dark mode” easily.

The html version is wasting a lot of space on the right side and the color scheme is awful (dark grey on a brown background, seriously? How is that any better? Edit: disabling dark mode yields a better reading experience wrt color scheme). Also, somehow links to references make another http request and have no backlink?

The html version could make sense if it had more dynamic functionalities: change fonts/line spacing, toggle color schemes, maybe a mini map or some other navigational tool? Also, some kind of support for highlighting and/or annotating?