by dash2 972 days ago
Can I make a simple point?

As an academic, 99% of my time is spent doing two things:

1. Writing statistical computations using a language like R or Python.

2. Writing English text.

The most important thing about a document language is that it should prioritize those things. For example, here's why Rmarkdown/Quarto is better than TeX. A TeX document starts:

    \documentclass[12pt,a4paper]{article}
    \usepackage{amsmath}
    \usepackage{amsthm}
    \usepackage{geometry}
    \usepackage{enumitem}
    \setitemize{noitemsep}
    \usepackage{tabularx}
    \usepackage{setspace}
    \newcolumntype{x}{>{\centering\arraybackslash}X}

    \newtheorem{theo}{Theorem}
    \newtheorem{prop}[theo]{Proposition}
    \newtheorem{lemma}{Lemma}

    \usepackage{fontspec,xunicode}
    \defaultfontfeatures{Mapping=tex-text}
    \setsansfont{TeX Gyre Heros}
A Quarto document starts:

    ---
    title: "Natural selection in the Health and Retirement Study"
    author: "XXX"
    abstract: |
      I investigate natural selection on polygenic scores
      in the contemporary US, using the Health and Retirement       
      Study. Results
      partially support the economic theory of fertility as
      an explanation for natural selection: among both white 
      and black respondents,
      scores which correlate negatively (positively) with education are
      selected for (against). Selection coefficients are
      larger among low-income
      and unmarried parents, but not among younger parents or those with less 
      education. I also estimate effect sizes corrected for noise in the 
      polygenic scores. 
    date: "September 2023"
You see the difference in emphasis.

You are comparing apples and oranges, at least a bit. The LaTeX equivalent is

  \documentclass{article}
  \title{Natural selection in the Health and Retirement Study}
  \author{XXX}
  \date{\today}
  \begin{document}
  \maketitle
  
  \begin{abstract}
      I investigate natural selection on polygenic scores
      in the contemporary US, using the Health and Retirement       
      Study. Results
      partially support the economic theory of fertility as
      an explanation for natural selection: among both white 
      and black respondents,
      scores which correlate negatively (positively) with education are
      selected for (against). Selection coefficients are
      larger among low-income
      and unmarried parents, but not among younger parents or those with less 
      education. I also estimate effect sizes corrected for noise in the 
      polygenic scores.
  \end{abstract}
  ...
  \end{document}


Everything else you have there in your preamble is about either adding capabilities or changing formatting; you don't show how that is achieved in the markdown version.

I think I get your point, but in practice that part doesn't really get in the way, and if you are doing the same thing over and over (e.g. for the same publication) it's just a template anyway.

I don't love TeX/LaTeX, but most of the markdown comparisons that emphasize "it's simpler" come out that way because the alternatives can't do as much. Which is fine until you need some of that capability.

It's absolutely true that you may need to customize things. And then you are stuck with the big quarto disadvantage: debugging a toolchain that typically looks like

    quarto -> knitr -> markdown -> pandoc -> [tex -> pdf | html]
and not knowing exactly where the error came from.

At the same time, the markdown defaults produce a nice, readable paper. The TeX defaults get you something that reminds you of Rubik's Cube and Duran Duran.

Is LyX still around? I remember it had good defaults. I haven't used it in ages and it had some installer issues, but I got fairly comfortable writing LaTeX papers without learning a ton of LaTeX...

Ofc that was a major downside, something other markdown editors figured out - if you give people buttons that make it easy and you make it easy to learn, they will learn what they need.

I don't get the problem. If 99% of your documents need the same packages and formatting, then all you need to do in LaTeX is create a template (e.g. via Yasnippet in Emacs) or dump it all in a LaTeX class file and then import it in your frontmatter, and Bob's your uncle. There are many frustrating things about LaTeX but I don't see how this is one of them.
Probably that's true. But first, I don't know how to create a class file, or a template (if that is a LaTeX thing). And since I've never seen anyone else do this, I guess that most academics don't either.

Second, my point isn't just about the specific issue, it's that this issue reveals how TeX thinks about the world. It thinks you want to spend your time writing TeX. No, I want to spend my time writing English. Here's another example. This is how you embed an image in quarto - it's just markdown:

    ![Caption](path/to/image.png)
And here is how you do it in TeX:

    \begin{figure}[t]
    \includegraphics{path/to/image}
    \centering
    \end{figure}
Which of these is easier to memorize and to read past? Similar comments apply to tables, links, numbering and so on.
I understand your frustration. Maybe it helps to know where this problem comes from.

TeX is extremely powerful and lets you create arbitrary documents. This is the first time I heard of quarto, but apparently it makes a lot of choices for you that you understandably don't really care about.

Instead of developing quarto, one could have simply written a LaTeX class that defines a function like so:

    \newcommand{\image}[2]{%
      \begin{figure}[t]\centering
        \includegraphics{#2}\caption{#1}\label{#2}%
      \end{figure}}
Now you can just write:

    \image{caption}{path/to/image}
Of course, it is now much less flexible, as you cannot define a custom label or different placement instructions. But that is the price you pay for short and memorable syntax.
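
Some of that flexibility can probably be recovered with an optional argument. A minimal sketch, assuming graphicx is loaded and using example-image (a placeholder that ships with the mwe package, assumed to be installed) in place of a real path. The three-argument form and the fig: label prefix are illustrative choices, not part of the command above:

    \documentclass{article}
    \usepackage{graphicx} % provides \includegraphics

    % Hypothetical variant: \image[placement]{caption}{path}
    % The optional first argument defaults to t (top of page).
    \newcommand{\image}[3][t]{%
      \begin{figure}[#1]
        \centering
        \includegraphics[width=\linewidth]{#3}
        \caption{#2}\label{fig:#3}
      \end{figure}}

    \begin{document}
    \image{Default placement}{example-image}
    \image[b]{Placed at the bottom}{example-image-a}
    \end{document}

The short form stays as short as before; the placement is only written out when it differs from the default.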

By the way, developing a LaTeX class is not necessarily hard. It is more or less a file whose name ends in `.cls` with all the commands that you typically put in your preamble. It just needs a header of three lines that define some metadata and also supports options. See here for an example: https://github.com/latex-ninja/colour-theme-changing-class-t...

You put it in the same directory as your main .tex file, or in the system-wide TEXMFLOCAL tree or your user-specific TEXMFHOME tree.
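
A minimal sketch of such a class file, assuming it is saved as myclass.cls (a hypothetical name) and builds on the standard article class. The packages loaded are only examples:

    % myclass.cls (hypothetical file name)
    \NeedsTeXFormat{LaTeX2e}
    \ProvidesClass{myclass}[2023/09/01 Personal article class]

    % Forward any class options (e.g. 12pt, a4paper) to article
    \DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}}
    \ProcessOptions\relax
    \LoadClass{article}

    % Whatever normally lives in the preamble goes below
    \RequirePackage{amsmath}
    \RequirePackage{graphicx}

    \endinput

A document would then start with \documentclass[12pt]{myclass}, and the 12pt option is passed through to article.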

More on creating your own class file:

https://www.overleaf.com/learn/latex/Writing_your_own_packag...

How I do it...

I keep a directory called LaTeX inside my home directory. Inside that I keep a file with all my frontmatter, myfrontmatter.sty (technically a package rather than a class), and also my biblatex file and a scan of my signature for signing letters. When I start a new LaTeX document I add the line \usepackage{/home/nanna/LaTeX/myfrontmatter} to the top (note, no .sty). This keeps my frontmatter minimal and tidy.

Inside myfrontmatter.sty:

  \NeedsTeXFormat{LaTeX2e} 
  \ProvidesPackage{/home/nanna/LaTeX/myfrontmatter}[2015/01/01 by me]

  \RequirePackage{amsmath} % Just replace `usepackage` with `RequirePackage`
  \RequirePackage{amsthm}
  ...

  \addbibresource{/home/nanna/LaTeX/biblatex.bib}
  ...

  %% Macros like for inserting my signature
  \newcommand{\mysignature}{\noindent\includegraphics{/home/nanna/LaTeX/signature.png}}
  ...
  \endinput % Not sure if this line does anything?
And that's it. I never have to worry about a package I've forgotten to add in. Granted a journal might not accept my custom package but I can always just copy and paste it all into my frontmatter, minus the top two lines and replace all the RequirePackages with usepackages.
Yes, and with tools like ChatGPT it's even easier to write TeX documents.

I posed this very problem to ChatGPT and it proposed exactly the same solution, and also another one using a custom environment.

That is just an expression of the LaTeX problem though.

People (as in, the majority of people) will not be comfortable using a tool that is so unintuitive and hard to use that you need to use an AI to help you in writing.

Writing a document is not supposed to be hard and require assistance to do.

The problem is that your goals and skills don't match the purpose and capabilities of the tool, not that the tool is insufficiently "intuitive".

Manuscript composition used to be: write your document by hand or with a typewriter, handwrite some notes in the margin, throw in some pages with your figures on them, then let a professional typesetter take care of all of the technical details of making a typeset document for printed output. This was a whole separate career, and the typesetter would sink almost as much time into making your document look pretty as you put into writing it.

If you are using LaTeX, you are taking on the role of the professional typesetter yourself, and you need to make some specific technical choices to get some output from it. This can be a problem if you are inexperienced and don't know which choices to make or in a hurry and don't want to make any choices, but is also good insofar as it lets you actually produce a professional quality document if you have the time and expertise to do so. The difficulty involved is at least an order of magnitude less than doing composition of metal type.

If you are using markdown (or whatever), you are just punting on having a professional document at the end, and/or letting a system make all of the choices for you (often badly), or perhaps expecting to still hand off your document to a professional at the end for proper typesetting.

> with tools like ChatGPT it's even easier to write TeX documents.

I am not quite sure whether this is satire?

Thanks for the help and I can feel the enthusiasm. I have to tell you, my hatred for TeX is profound and goes far beyond this one point. But if I start ranting, I'll never stop.
Didn't you say with quarto you had to debug a 5 layer pipeline? I wonder if it's not biasing you here a bit...(stuck fighting "arcane" latex syntax somewhere at 3 in the morning).

I'm not saying you should love TeX, but it's a bit like saying you hate assembly language - if you have the wrong abstractions (writing a 3D game or a web page using assembly language) of course the experience will be beyond frustrating. I don't hate assembly language, but I generally don't need to touch it because higher order abstractions generally suffice. If I am optimizing my compiler output, though, then it's a tool I can use.

Ofc if I have the wrong or missing tools while using assembly language, or any other TBH (python, html, etc), that is also a source of considerable frustration. Not sure where the "hatred" comes from, but perhaps you encountered a poorly done package or editor?

> Instead of developing quarto, one could have simply written a LaTeX class that defines a function like so:

If it's so simple, why isn't there QuarTeX that does all of that and removes the extreme verbosity barrier?

Because doing something like this is opinionated. The LaTeX developers try to do the opposite: they want to provide packages that cover as many use cases as possible.

And users of LaTeX are probably not knowledgeable enough or too busy to publish their opinionated subset of LaTeX as a class. I don't know for sure. There is no central body that has an interest in removing barriers, so you might as well ask me or yourself why I or you haven't published anything.

I developed something similar to this at my company, because we write lots of LaTeX documents and need shortcuts like this not only for brevity but also so that we achieve some consistency across the entire team. It's only for internal use though and thus not public.

> they want to provide packages that cover as many use cases as possible.

How would this macro exclude the coverage of other use cases? The language primitives would still be there for writing the original code.

> There is no central body

That's true for plenty of other syntaxes that at the same time aren't like this.

> in quarto - it's just markdown:
>
>     ![Caption](path/to/image.png)
>
> And here is how you do it in TeX:
>
>     \begin{figure}[t]
>     \includegraphics{path/to/image}
>     \centering
>     \end{figure}

This is not the same thing! The LaTeX equivalent to your markdown would be

     \includegraphics{path/to/image.png}
which is arguably simpler and cleaner than the markdown. The figure environment is unnecessary when you just want to put a figure right there. You only need the figure environment when you want your image to "float" to a random place in your page.
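
For completeness, a minimal sketch of a full document that places an image exactly where it appears in the text, with no float. It assumes graphicx is loaded (which \includegraphics needs) and uses example-image, a placeholder from the mwe package that is assumed to be installed. The center environment and the width are illustrative choices:

    \documentclass{article}
    \usepackage{graphicx} % provides \includegraphics

    \begin{document}
    Text before the image.

    % No figure environment: the image stays right here
    \begin{center}
      \includegraphics[width=0.5\linewidth]{example-image}
    \end{center}

    Text after the image.
    \end{document}
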
> You only need the figure environment when you want your image to "float" to a random place in your page.

Which is also something the Markdown version can't do at all (give fine control over how the image is positioned). You have to use raw HTML plus probably some CSS if you want that.

God, the amount of pain I've had trying to use that "fine-grained control".
I can tell you, as another academic, that I almost always get links and images wrong in Markdown: which of title and URL goes in square brackets and which in round ones, forgetting the !, the conventions around file paths (some Markdown processors need file://...). Admittedly, if I always wrote in one Markdown style I would get used to it. On the other hand, I never get it wrong in LaTeX, and let's not even talk about how to do different alignment, captions and labels.
> I don't know how to create a class file, or a template (if that is a LaTeX thing). And since I've never seen anyone else do this, I guess that most academics don't either.

Exactly, academics usually don't do that - they write the text with appropriate markup, and then put it in the publisher's template and the formatting according to the appropriate standards is done. You can write your own template, but usually you use someone else's, with the big benefit that you can generally move your content to a very different template of a different publisher with minimal or no changes to your actual writing.

Now how would I do that in quarto - what (and how much) would I need to write to ensure that, for example, the captions for all the images and all the references to the images are all formatted in a specific manner? Because for quarto I would need to make my own template specifying the exact formatting and layout, and a quick browse of its documentation didn't lead me to any examples on how I would control that.

In sane environments there is a split between text and formatting. However, the formatting part has to be sufficiently powerful to meet the various requirements, so there is a quite high minimum bar to meet there. LaTeX works because I can rely on being able to easily get my markup laid out exactly as required by arbitrary standards. For any markdown-type standard I need some assurance that this will be possible and easy, and that I won't need to (for example) go over all my references and do something to them.

Again, apples and oranges. Yes, it's more markup than e.g. markdown (which is fundamentally less capable). But how do you do the equivalent of the [t] and \centering in the former on a per-figure basis? What about scaling it differently from other figures in your doc, or embedding a reference in the caption with a particular style?

For that matter your equivalent is still one line, it's just \includegraphics{path}. The figure environment is just adding extra capabilities.

I agree not everyone needs to do this, but the trade-offs you are illustrating are not "X is better than Y" so much as "X is simpler than Y, and can't do as many things".

For you that trade-off makes sense, great. But I wouldn't generalize it to the value of the tool. I know plenty of academics who are quite proficient at TeX, let alone the simpler LaTeX, and find it lets them generate the content they want easily enough, given its power.

This isn't just mathematicians either, though most of the people I know using it came to that out of a need to do math typesetting properly. How would you for example generate a mixed language document with both left-to-right and right-to-left languages formatted correctly?

LaTeX's real problem isn't the syntactic load (easily handled with a decent editor); it's the package system. It can be abused to e.g. generate conference posters well, but it's hairy once you get into the details.

That’s a great illustration of the tradeoff. On the surface the LaTeX version looks harder, but you can specify a caption, how the figure floats with other items, how it’s justified, the zoom level, you can add a reference label that you can hyperlink to from elsewhere in the doc, etc etc etc.

The markdown one you get what you get. Maybe that’s fine. If it isn’t you are out of luck.

The latex one requires more of you but gives you much more functionality in return.
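
A minimal sketch of what those extra capabilities look like together, assuming graphicx and hyperref are loaded and using example-image (a placeholder from the mwe package, assumed to be installed). The placement, width, caption text and label name are all illustrative:

    \documentclass{article}
    \usepackage{graphicx} % \includegraphics
    \usepackage{hyperref} % makes \ref a clickable link

    \begin{document}
    Figure~\ref{fig:example} shows the placeholder.

    \begin{figure}[htbp]                % where the float may go
      \centering                        % justification
      \includegraphics[width=0.6\linewidth]{example-image} % scaling
      \caption{A placeholder image.}    % caption
      \label{fig:example}               % target for \ref
    \end{figure}
    \end{document}

The document needs two compiler runs before the reference resolves to a figure number.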

Which is better is going to be entirely situational/personal preference.

   \documentclass{article}

   \title{Natural selection in the Health and Retirement Study}
   \author{XXX}
   \date{September 2023}

   \begin{document}

   \maketitle

   \begin{abstract}
   I investigate natural selection on polygenic scores in the contemporary US, using the Health and Retirement Study.
   Results partially support the economic theory of fertility as an explanation for natural selection: among both white and black respondents, scores which correlate negatively (positively) with education are selected for (against).
   Selection coefficients are larger among low-income and unmarried parents, but not among younger parents or those with less education.
   I also estimate effect sizes corrected for noise in the polygenic scores. 
   \end{abstract}

   \end{document}
The difference is you know the easy way to get what you want in rmarkdown/quarto but only know the hard way in latex?

My LaTeX docs look nothing like that because I put all the boilerplate in a .sty file.

Makes me wonder: if Nota worked with Rust, we could move the dependencies into a Cargo.toml file and compile our documents. That would enable declarative macros for document generation at compile time as well as interactive Rust code inside your document at reading time. Plus you could refer to the dependencies by their package name, like amsmath::line or something.
You might be interested in Typst. It offers an incremental live compiler.