Hacker News
by felixfbecker 2658 days ago
I'm sorry but LaTeX is an inconsistent Turing-complete mess. LaTeX commands are far from intuitive, otherwise Detexify wouldn't be so popular. It's a million macros held together with duct tape with no consistency in naming and syntax, and always dependent on which age-old package you're referencing. It's good at outputting pixel-perfect printable, non-accessible PDFs, and that's it.

I'm excited by this because I hope that with a proper widely-supported system for math in HTML we can eventually write more papers to be digital-first. An HTML document is so much more accessible, searchable, semantically analysable and flexible than LaTeX and its PDF output. I dream of a world where the standard for papers is not LaTeX .pdf but .mhtml or .maff.

3 comments

It is true that the LaTeX ecosystem as a whole is a mess of packages and macros. But most of its mathematical typesetting comes from the underlying TeX (and a set of macros maintained by the AMS), and it's fairly small and consistent. The Detexify you mention is only for looking up specific symbols provided by various fonts (packages), and has nothing to do with mathematical typesetting or LaTeX macros in general. TeX/LaTeX engines support OpenType fonts now, and if you want to use one of them, you can just type ∞ instead of \infty or ℝ instead of \mathbb{R} (actually you can do this regardless with unicode-math), bypassing the need for looking up symbols-a4 or Detexify.
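
To make the Unicode-input point concrete, here is a minimal sketch, assuming a Unicode-aware engine (XeLaTeX or LuaLaTeX) and the unicode-math package. The font choice and the formula are only illustrative and not taken from anything above:

```latex
% Compile with xelatex or lualatex (assumed; pdflatex does not load unicode-math).
\documentclass{article}
\usepackage{unicode-math}
\setmathfont{Latin Modern Math} % illustrative choice of OpenType math font
\begin{document}
% Traditional macro input:
\[ \lim_{x \to \infty} f(x) \in \mathbb{R} \]
% The same formula with the symbols typed directly as Unicode:
\[ \lim_{x \to ∞} f(x) \in ℝ \]
\end{document}
```

Both displays should typeset the same way, so symbol lookup is only needed for characters you cannot easily type.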

Encoding the aesthetics of good mathematical typesetting is not trivial, and Knuth and others have spent decades on it based on studying and absorbing all the tricks that hot-metal typesetters had come up with over centuries. It would be foolish to throw away all that hard-won knowledge and implement half-baked solutions from scratch: those working in the field understand this (though the original MathML proponents perhaps did not), which is why the linked post mentions “math rendering based on TeXbook’s appendix G”.

More generally, in this conversation (and in any discussion about MathML), several things get conflated:

1. What syntax the user types to get their mathematics. I think it's beyond dispute here that no one wants to type MathML by hand (and even the MathML advocates do not propose it AFAIK). Also, so many people are familiar with TeX/LaTeX syntax that it must be supported by any complete solution, though alternatives like AsciiMath or some interactive input are still worth exploring.

2. How the mathematics is actually encoded in the HTML file, or delivered to the browser. Frankly I don't think this one matters much because it's invisible to the user; any of raw TeX syntax, or MathML syntax, or fully-positioned HTML+CSS, or SVG, will probably do.

3. Who does the rendering and typesetting / layout. The promise/dream of MathML is that a standard will be specified and all browsers will implement it; though this is yet to become reality. Meanwhile, typesetting can already be done server-side (someone runs TeX/MathJax/KaTeX/etc before sending it to the browser) or client-side (MathJax/KaTeX running in the user's browser) instead of being done in the browser's native code.

4. The quality of the typesetting/the algorithms used. I already mentioned this in the second paragraph above so I won't reiterate it, but this has been mostly underestimated/ignored by those advocating MathML. The decisions made by TeX reflected the best journals of the early 20th century and have in turn become the shared aesthetics of the mathematical community; “so-so” typesetting will not do.

5. What the result/output of all this rendering/typesetting/layout will be, in the web page's DOM. This affects things like usability (being able to copy-paste), scaling/zooming, styling individual parts of formulas, etc. Again, already (La)TeX+dvisvgm supports SVG for this, and MathJax supports HTML+CSS, MathML or whatever. Anything other than raster (PNG etc) images is probably fine here.

The main new/useful thing I can see with MathML is with (3); the browser doing the typesetting. But that's hard, and it has a lot of other challenges to overcome too. And as MathJax/KaTeX/dvisvgm demonstrate, the facilities provided by the browser for layout (HTML+CSS for example) are already sufficient for print-quality typesetting.

This is an interesting, thought-provoking comment!

> (though the original MathML proponents perhaps did not)

FWIW I'm pretty sure that they did. Arguments from authority are pretty terrible, but if you look at the authors of the MathML 1.0 (earliest) or 3.0 (latest) specs[0][1], and google them, you can see that many of them have backgrounds in science or math and have been active in the LaTeX ecosystem.

> but this [quality of the typesetting] has been mostly underestimated/ignored by those advocating MathML.

I don't see any evidence for this, not among its designers, implementers or even general proponents.

Firefox's output (implemented almost(?) entirely by individual volunteers), for instance, is acknowledged to be still considerably worse than LaTeX output in a pdf, though it is competitive with its web alternatives (superior in some respects, worse in others) — do be sure to install MathML fonts[2] though.

> 5. What the result [...] will be, in the web page's DOM.

Have you seen the tag soup generated (by necessity) with MathJax or KaTeX?

[0] https://www.w3.org/TR/REC-MathML/

[1] https://www.w3.org/TR/MathML3/

[2] https://developer.mozilla.org/en-US/docs/Mozilla/MathML_Proj...

I guess when I say “MathML proponents” I ought to be more careful in my thinking to make a distinction (even if they are often the same people) between those working on MathML as a project, and those advocating for its actual use today under current conditions. I have no problems with the former; I wish them good luck and look forward to trying the result when it's ready. For the latter, I can only say that anyone advocating using MathML today despite its problems (poor layout and browser support) clearly cares about something else more than they do about actually communicating mathematics to humans.

No doubt among the authors of the MathML specs there are people who care about typesetting. Though I'll note that being active in the LaTeX ecosystem is not a guarantee of this: the prime example is the author of LaTeX (Leslie Lamport) himself, who makes a pitch for the LaTeX model around (mostly) not caring about the appearance: https://lamport.azurewebsites.net/pubs/document-production.p... — in contrast with Knuth who devotes the largest (despite smaller font size) chapter of The TeXbook to Fine Points of Mathematics Typing. (A blog post by one of the authors of the MathML spec you linked to: https://blogs.msdn.microsoft.com/murrays/2011/04/30/two-math...) In fact some of the worst mathematical typesetting I've seen is by people who wrote in LaTeX and blindly trusted it to produce the best typesetting, and sometimes even ignored the warnings about overfull/underfull lines.

Looking at the MathML sample page in Firefox (https://mdn.mozillademos.org/en-US/docs/Mozilla/MathML_Proje...), there are many that are worse and none that is better than TeX's output (which for some reason is given on the page in low-resolution images rather than high-dpi images or SVG). In any case, even if you feel that some are subjectively better, it's still the case that the aesthetic most everyone wants is “like TeX”. And personally I've seen very little by Firefox/MathML people on their layout decisions (if they decided to do things differently from TeX, why?), while with MathJax I've seen that if their output is found not to match TeX's it is treated as a bug report and a fix attempted. What are the respects in which you say Firefox's output is visually superior to MathJax's? Is there a page demonstrating them?

> Have you seen the tag soup generated (by necessity) with MathJax or KaTeX?

Yes, and it's not pretty. But (1) among the list of things to care about this is the lowest of the low, as it does not affect what is visible to the user, and (2) the tag soup of MathML, though shorter, has still all the XML ugliness so it's not as if it will ever be readable. (In fact looking at the comparison of different input formats https://en.wikipedia.org/w/index.php?title=MathML&oldid=8864... for sufficiently complex equations even the TeX/AsciiMath inputs become unreadable; the well-typeset visual representation may be the only somewhat readable one.)

> For the latter, I can only say that anyone advocating using MathML today despite its problems (poor layout and browser support) clearly cares about something else more than they do about actually communicating mathematics to humans.

Much of the reason people are in favour of providing MathML output is that if nobody did, then the likelihood of Chrome getting MathML support would have been near zero (and the risk of Firefox removing its existing (if imperfect) implementation, high). (A chicken and egg problem.) Since MathML is likely (you may disagree, but you must agree that it's plausible that if a JavaScript solution can do it, a native one can probably do it better) to lead to better equation typesetting on the web, once it's properly implemented, people advocating MathML today can very much care about better communicating maths to humans, in the long run. Also, MathML is not incompatible with MathJax: in fact, since MathJax's internal representation of equations is similar to MathML's, converting from MathML (to SVG/HTML+CSS/whatever) is faster than converting from TeX. So it's not as if providing MathML in the document dooms your readers to its "ugly" output. (And yes, MathML can be both an input and an output for MathJax...)

> [...] (which for some reason is given on the page in low-resolution images rather than high-dpi images or SVG) [...]

That's because the page was made several years ago, when such images were the norm (look at a Wikipedia page on the Archive from even 2016), and hasn't seen significant updates since, because MathML was feared dead (due to Chrome, at the time, explicitly rejecting support) and volunteer effort dried up. Somebody™ should update this...

> Looking at the MathML sample page in Firefox [...], there are many that are worse and none that is better than TeX's output

For the little that it's worth, IMO five are worse, two are slightly better and the rest just as good as the TeX.

> And personally I've seen very little by Firefox/MathML people on their layout decisions (if they decided to do things differently from TeX, why?), while with MathJax I've seen that if their output is found not to match TeX's it is treated as a bug report and a fix attempted.

Far more people work on MathJax than on MathML — MathJax has a two-member team, plus support from AMS, plus volunteers, while MathML has only had occasional volunteers (see above; not that it was much better previously), so it's not a fair comparison. Also the page comparing TeX with MathML was made by the people in favour of MathML precisely because they care about trying to achieve parity with TeX.

> What are the respects in which you say Firefox's output is visually superior to MathJax's?

For instance, the speed with which the output is rendered. Several seconds of latency for a better final appearance might be a bargain you want to take, but it's still a visual trade-off. (You can use server-side MathJax, but then you lose some end-user customisability.)

> the tag soup of MathML, though shorter, has still all the XML ugliness so it's not as if it will ever be readable.

How else would you expose the structure of an equation, to the DOM, than with XML ugliness? It's just important that the XML ugliness makes sense (and MathML's does).

If LaTeX can be rendered in HTML, then the world you dream of could become reality.

Nobody wants to learn a new typesetting language. LaTeX is much less hard to learn and use than you imply.

I mean just look at MathML https://en.wikipedia.org/wiki/MathML#Example_and_comparison_...

I think it can be, it's just that nobody does it. I don't know what formulas get compiled to though. I would guess PNGs, at which point you lose the benefits over a PDF (at least for equations). With proper browser support for MathML it could always be MathML though, which would render sharply and be readable by a screen reader.

It is unfortunate that MathML is not easily hand-written like the rest of HTML. What I find interesting in the link you posted though is that they embed the LaTeX or StarMath representation inside the MathML. It should be possible to have tooling in text editors such that you only edit that LaTeX representation and on save the MathML representation is updated automatically.
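
As a rough sketch of the first half of such tooling, the snippet below reads the TeX source embedded in a MathML fragment and checks whether it still matches what the author typed. It assumes the source is stored in an `<annotation encoding="application/x-tex">` element inside `<semantics>`. The sample markup and the helper names `embedded_tex` and `is_stale` are hypothetical, and the actual TeX-to-MathML conversion is deliberately left to whatever external converter an editor plugin would call.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample: presentation MathML for x^2 + 1 with its TeX source attached.
SAMPLE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow>
      <msup><mi>x</mi><mn>2</mn></msup>
      <mo>+</mo>
      <mn>1</mn>
    </mrow>
    <annotation encoding="application/x-tex">x^2 + 1</annotation>
  </semantics>
</math>"""


def embedded_tex(mathml: str) -> Optional[str]:
    """Return the TeX annotation stored in a MathML fragment, or None if absent."""
    root = ET.fromstring(mathml)
    for annotation in root.iter(f"{{{MATHML_NS}}}annotation"):
        if annotation.get("encoding") == "application/x-tex":
            return (annotation.text or "").strip()
    return None


def is_stale(mathml: str, edited_tex: str) -> bool:
    """True when the TeX the author edited no longer matches the embedded copy,
    i.e. the MathML would need to be regenerated on save."""
    return embedded_tex(mathml) != edited_tex.strip()


if __name__ == "__main__":
    print(embedded_tex(SAMPLE))
    print(is_stale(SAMPLE, "x^2 + 2"))
```

An editor hook could call `is_stale` on save and only invoke the converter when it returns true, so the author never touches the MathML directly.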

While LaTeX might be hard to learn and inconsistent in its entirety, its formula syntax is quite consistent and a lot easier to write by hand than MathML.