by fbdab103 972 days ago
TeX was definitely groundbreaking, but I consider it a product of its time. Far too much cleverness in macro expansions and weaving tricks to keep memory usage in check for what are now laughable limits.

Missing far too many niceties in comparison to modern languages with more guardrails to protect yourself from silly mistakes. The only way I can write Latex is to heavily rely upon \input{} segments to keep isolated blocks in case I break something through a missed escape.
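
A minimal sketch of that \input{} layout, assuming two hypothetical section files (the names sec-intro.tex and sec-results.tex are made up for illustration and are written out with filecontents* so the example compiles on its own):

```latex
% Two tiny hypothetical section files, written out so the example is self-contained.
\begin{filecontents*}{sec-intro.tex}
Introduction text.
\end{filecontents*}
\begin{filecontents*}{sec-results.tex}
Results text with an escaped percent sign: 50\%.
\end{filecontents*}
\documentclass{article}
\begin{document}
\input{sec-intro}
% Comment out one \input at a time to find the file that breaks the build.
\input{sec-results}
\end{document}
```

If a missed escape breaks compilation, commenting out one \input line at a time narrows the problem down to a single file.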

I keep yearning for a modern take, but it feels like we are stuck in a local optimum from which there is no escape. New platform has to fight with the decades of accumulated inertia and packages which exist in Tex.

4 comments

There are plenty of these markup languages. The reason none of them really challenge tex/latex in its own space, is that they don't aim to do what tex/latex does.

Latex is "typographically-complete". Markdown and friends are explicitly not. HTML+CSS is. But what latex has is a reasonable enough syntax that a human can write it by hand, unlike HTML+CSS. Moreover, the syntax, though clunky [1] is designed, as much as possible, to not interfere with the content that the human is writing.

For instance, Latex uses curly brackets {} for macro arguments, because they are the least used brackets for content. So when you are reading a latex source, you know that () and [] are content, and only {} are ambiguous [2]. Nota uses a mix of all three brackets for its syntax, causing additional pain for the person reading/writing the source.

The replacement for TeX/latex is never going to be a simpler language. It is going to be a language just as complex as latex. But it can definitely be cleaned up and sped up compared to latex. IMHO, somebody should write tex from scratch, improve its syntax but otherwise keep it largely unchanged. Basically, any plain latex source using some of the popular packages should continue to compile and give the same output. That is the only reasonable way out.

[1] A typographically-complete language will never have a non-clunky syntax.

[2] Escaped brackets \{1,2,3 \} are literal curly brackets. Personally, I only use them for mathematical sets and have defined a macro \set, so in my documents {} are 99% not ambiguous.
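
A sketch of what such a macro could look like. The actual definition is not shown above, so this one-argument \set is an assumption:

```latex
\documentclass{article}
% Hypothetical definition: wraps its argument in literal (escaped) braces.
\newcommand{\set}[1]{\{#1\}}
\begin{document}
% The only braces left in the source are macro-argument braces.
$\set{1,2,3}$
\end{document}
```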

> what latex has is a reasonable enough syntax that a human can write it by hand, unlike HTML+CSS. Moreover, the syntax, though clunky [1] is designed, as much as possible, to not interfere with the content that the human is writing

I could not disagree more. LaTeX syntax is not 'clunky', it's a mess, and has intentionally been engineered right from the start to be clever rather than consistent. And it's not the syntax only, the obvious mess that is LaTeX's surface goes right on, right to the heart ("the guts" as TeXnicians prefer to say) of the machinery, where no concern is dealt with separately, and anything can influence and break everything else.

Hell you don't even get a semblance of sane text (string) processing or decent numerical computation. Yes, you can do it, in the way you could use a toothbrush or wet wipes to paint your house.

> Latex is "typographically-complete"

Yes, as long as one is ready to ignore the fact that quite a few simple things are quite difficult to achieve in LaTeX, e.g. keeping lines the same height and keeping register instead of jumping around whenever a superscript is encountered.

> The replacement for TeX/latex is never going to be a simpler language. It is going to be a language just as complex as latex.

The complexity of LaTeX is only in part due to the complexities of typesetting. It is complex because of an endless litany of bad design choices. HTML+CSS+JS gets a lot of flak for being too complex, but they pale in comparison. For example[1]:

In order to use numerical codepoints to write 東京, you can write any of:

    ^^^^6771 ^^^^4eac   
    \char"6771 \char"4EAC   
The space between the entities is used to signal the end of the codepoint number, so to write 東 京 with a space you must use tricks, such as one of:

    \char"6771{} \char"4EAC   
    \char"6771\ \char"4EAC 
In this system, ^^5c represents the backslash. But unlike in reasonable systems, of which TeX is not one, writing it as a numerical reference doesn't deactivate the backslash's special role as command indicator.
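
A small sketch of that last point, assuming a standard LaTeX setup where ^ keeps its usual superscript category code: the ^^5c is turned into a backslash before TeX splits the line into tokens, so it still starts a command.

```latex
\documentclass{article}
\begin{document}
% ^^5c is replaced by a backslash while the line is being read,
% so the next line is processed exactly like \textbf{bold}.
^^5ctextbf{bold}
\end{document}
```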

Compare this to XML / HTML &#x6771;&#x4eac;, which is a much more reasonable syntax, not any harder to write, and uses an explicit end-of-command marker instead of the 'clever' space which is highly problematic.

[1]: https://agiletribe.wordpress.com/2015/04/07/adding-unicode-c...

> In order to use numerical codepoints to write 東京, you can write any of:

There’s a simpler way:

  \usepackage[utf8x]{inputenc}
  
  東京
Or better still, use XeLaTeX.

But that's not the point. The point is that (1) sometimes you don't want the literal codepoint but a numerical reference in your source code; a use case for this would be `&#x3000;` instead of a literal ideographic space, which might be useful to prevent it from being accidentally elided when at the end of the line.

(2) irrespective of whether you want to use numerical references or not, the example shows that apparently the authors of (La)TeX are unable to use sane syntaxes for their stuff. It's just a very bad idea to terminate your variable-length commands with a space when a space in the output could possibly follow. Same with identifiers: only letters are allowed, no underscores, no digits. You then get names like `\fooBarBazVI` instead of `\foo_bar_baz_6`, which many would prefer. These are all trifles to be sure, but they're legion, so you get software that seemingly takes Death By a Thousand Papercuts as a positive design maxim.
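
To illustrate the identifier restriction: under the default category codes, typing \foo_bar_baz_6 directly is read as \foo followed by an underscore, and the usual workaround is to build the name with \csname. A sketch (the macro name and its replacement text are only placeholders):

```latex
\documentclass{article}
\begin{document}
% Define a control sequence whose name contains underscores and a digit.
% \expandafter makes TeX build the name first, then \def defines it.
\expandafter\def\csname foo_bar_baz_6\endcsname{hello}
% It can only be called the same roundabout way.
\csname foo_bar_baz_6\endcsname
\end{document}
```

So such names are possible, but every use has to go through \csname ... \endcsname, which rather supports the papercut argument.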

LaTeX definitely has many messy parts that need to be cleaned up. Native support for Unicode characters and bidi text (which is somewhat implemented by xetex) is mandatory in new-latex.

The TeX engine obviously will need to be rewritten completely from scratch for the reasons you suggest.

Here's a modern take: https://typst.app/

For me, there is one fundamental issue which makes me want to switch from LaTeX -- it can't produce accessible documents (and good HTML would do just fine as accessible). LaTeX is making good progress in this area.

Amazingly (to me) it seems typst is doing even worse than LaTeX, while starting much later! I'm happy to be told they have succeeded in this area of course.

While accessibility is an important area, it affects a minority of people. You need to get the core functionality first before you can spend resources on accessibility.

Latex is very old and has the features; they can focus on accessibility now.

No, I completely disagree. You need to design accessibility in from the start; it's almost impossible to retrofit. Very few systems manage to add high-quality accessibility later on.

I'm one of the Typst devs and I do agree with you here. LaTeX has a lot of trouble with accessibility because it's hard to retain semantic information through layers of macros. However, I think we are in a better starting position because Typst is designed to revolve around semantic elements that the compiler can actually understand. We haven't gotten to it yet (there's lots to do), but we want to use this information both to output Tagged PDFs and for semantic HTML export. I guess we'll see how it turns out!

I wish you the best of luck. I don't have any time to get involved in any more open source projects, but I consider the lack of common accessible publishing formats for science one of the biggest embarrassments of academia: for a field that claims to be open, we sure seem to love churning out horridly inaccessible PDFs (and yes, I'm as guilty as anyone else here).

You should design for it from the beginning, but it is not necessary to implement it from the beginning.

> I keep yearning for a modern take, but it feels like we are stuck in a local optimum from which there is no escape. New platform has to fight with the decades of accumulated inertia and packages which exist in Tex.

I believe the issue is that the better-than-LaTeX language needs to be not just better, but so much better that all the tooling and extensions for LaTeX are ported to it. Before this, it won’t be better than LaTeX. So it’s a kind of a chicken-and-egg issue.

TeX is still unparalleled in layout and all the fine details. But writing TeX/LaTeX every day is not the best way to do it.

The correct way to do it is:

Your favorite markup -> pandoc script and includes -> Latex
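
As a rough sketch of the "includes" step, assuming a hypothetical preamble.tex that is handed to pandoc with its --include-in-header option (for example `pandoc -s paper.md --include-in-header=preamble.tex -o paper.tex`, where both file names are made up):

```latex
% preamble.tex: hypothetical include file, inserted at the end of the
% LaTeX header that pandoc generates. It is a fragment, not a full document.
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhead[R]{Draft}
```

The markup source stays free of LaTeX, and the layout tweaks live in one file that can be reused across documents.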