by TheRealPomax 1823 days ago
The state of maths on the web, speaking as someone who uses a lot of maths on their website[1], comes down to recognising that loading times on the web shouldn't exist, so you should just turn your formulae into images that you load using `loading="lazy"`. And of course, to make sure they fit any resolution: generate SVG images.

And no, MathML is irrelevant: you don't care about MathML, and your users don't care about MathML: all you care about is that users can read your formulae, and all your users care about is that they see decent-looking maths. As much as I like the idea of MathML, there is simply no reason to ever use it. Nothing is mining the web for maths, and semantic markup for maths buys you nothing.

You have a build system (because your content is generated from markdown or the like; no one who wants to deploy a real site writes pure HTML in 2021), so make that generate SVG images by literally just running LaTeX during your build, and replace all your maths with <img> elements that point to those SVG images instead. Why would you even bother with MathJax or KaTeX? They put the burden on your users, which is ridiculous: you're building your content already, so just build static content for your formulae.
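
A rough sketch of what that build step could look like, not the actual code behind the site. It assumes display maths is written as `$$ ... $$` in the markdown source; `replace_formulae`, `svg_name` and the `images/latex` directory are names made up for illustration. Each formula gets an `<img>` tag with lazy loading, with the LaTeX source as its alt text.

```python
import hashlib
import html
import re

# Assumption: display maths is delimited by $$ ... $$ in the markdown source.
FORMULA = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def svg_name(latex: str) -> str:
    """Derive a stable file name from the LaTeX source (hypothetical naming scheme)."""
    digest = hashlib.sha1(latex.strip().encode("utf-8")).hexdigest()[:12]
    return f"{digest}.svg"


def replace_formulae(markdown: str, image_dir: str = "images/latex"):
    """Swap every $$...$$ block for an <img> tag.

    Returns the rewritten text plus a {file name: LaTeX source} dict of
    formulae that still need to be rendered to SVG by a later step.
    """
    to_render = {}

    def to_img(match):
        latex = match.group(1).strip()
        name = svg_name(latex)
        to_render[name] = latex
        # Escape the LaTeX so characters like < and " are safe inside alt="...".
        alt = html.escape(latex, quote=True)
        return f'<img src="{image_dir}/{name}" alt="{alt}" loading="lazy">'

    return FORMULA.sub(to_img, markdown), to_render
```

Hashing the LaTeX source means an unchanged formula keeps its file name between builds, so only new or edited formulae need re-rendering.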

And sure, does your site have maybe 10 formulae? By all means, use MathJax or KaTeX. But if it relies on maths, generate your graphics offline using actual LaTeX (and this is trivial using GitHub Actions[3]) and use <img> elements that point to those SVG images.

(I run my maths through xelatex, then losslessly convert the resulting PDF to SVG by first cropping the PDF, then running pdf2svg[2]. Is that a lot of work? No, it is not. It's a one-time setup and it simply runs whenever content gets updated. It's about as no-effort as it gets)
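
For the curious, a minimal sketch of that three-step pipeline, not the actual build script linked in [2]. The post only says the PDF gets cropped, so using `pdfcrop` for that step is an assumption, as are the function names and the `build` directory. It also assumes each `.tex` file is a complete document holding a single formula.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file: str, out_dir: str = "build"):
    """Return the command lines for: typeset with xelatex, crop the PDF, convert to SVG."""
    stem = Path(tex_file).stem
    pdf = f"{out_dir}/{stem}.pdf"
    cropped = f"{out_dir}/{stem}-crop.pdf"
    svg = f"{out_dir}/{stem}.svg"
    return [
        # Typeset the formula; nonstopmode keeps a LaTeX error from waiting on input.
        ["xelatex", "-interaction=nonstopmode", f"-output-directory={out_dir}", tex_file],
        # Assumption: pdfcrop trims the page down to the formula's bounding box.
        ["pdfcrop", pdf, cropped],
        # pdf2svg takes an input PDF and an output SVG path.
        ["pdf2svg", cropped, svg],
    ]


def render(tex_file: str, out_dir: str = "build") -> None:
    """Run the pipeline; needs xelatex, pdfcrop and pdf2svg on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(tex_file, out_dir):
        subprocess.run(command, check=True)
```

Calling `render("formula.tex")` would leave `build/formula.svg` behind, assuming all three tools are installed.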

[1] https://pomax.github.io/bezierinfo

[2] https://github.com/Pomax/BezierInfo-2/blob/master/src/build/...

[3] https://github.com/Pomax/BezierInfo-2/blob/master/.github/wo...

5 comments

> Nothing is mining the web for maths, and semantic markup for maths buys you nothing.

I find that rather unfortunate. A math search engine that finds sites with equivalent formulae (or segments) would be quite useful to me.

Of course, that can probably be made to work with image alt attributes containing LaTeX code.
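
A small sketch of the indexing side of that idea, assuming formula images carry their LaTeX source in `alt` and are marked with a `formula` class. That class name, `FormulaAltCollector` and `extract_formulae` are all made up for illustration, not an existing convention.

```python
from html.parser import HTMLParser


class FormulaAltCollector(HTMLParser):
    """Collect the alt text of <img> tags that carry a given class."""

    def __init__(self, css_class="formula"):
        super().__init__()
        self.css_class = css_class  # hypothetical marker class for formula images
        self.formulae = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        alt = attributes.get("alt")
        if self.css_class in classes and alt:
            # The parser has already unescaped entities such as &lt; in the value.
            self.formulae.append(alt)


def extract_formulae(page_html: str):
    """Return the LaTeX strings found in the alt attributes of formula images."""
    collector = FormulaAltCollector()
    collector.feed(page_html)
    collector.close()
    return collector.formulae
```

A search engine would still have to normalise the LaTeX it collects to match equivalent formulae, which is the hard part and not covered here.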

What would you use it for? And that's a serious question: what would you use it for, as opposed to just using wolfram alpha or some other service that can already get you all the answers, analysis, and more, without having to mine MathML from random pages on the internet?

In the past, I used a math search engine[0] to find solutions for Olympiad problems, especially inequality ones. I imagine it would be useful when you want to find the name of some formulas or expressions that you came across, though probably not much more.

[0] My typical query: https://approach0.xyz/search/?q=OR%20content%3A%24a_%7Bn%2B1...

Semantic MathML is an absurdity, like marking up the tree diagram of all your English sentences, and linking all words to a URL with their dictionary definition. In short, the sort of thing only the semweb wonks could have dreamed up.

I like this! It's similar to what I did when turning formulae of an ebook from gif to something prettier and more editable. The whole book only had a dozen formulae, so manual work actually covered everything. I used the codecogs editor - https://www.codecogs.com/latex/eqneditor.php - which can emit svg and png. I got it working fine, with svg fallback, in GitHub markdown and epub.

Accessibility? Being able to copy/paste the formulae into a formula editor or solver? Being able to easily style the formula (including for dark themes)? ...

When was the last time you actually wanted to do that, rather than just wanting to hypothetically raise that possibility for the sake of an argument about web technology?

Because in reality, based on my experience at least, no one actually needs that. Folks can copy a formula that they got from an image into wolfram alpha just fine. And the folks who can't don't actually benefit from MathML: they benefit from the JS Selection and Range functions, when site owners take the time to make sure that text-selection of a formula image leads to a LaTeX formula being put in the clipboard, instead.

"That's way more work" but since we're all using build systems anyway: no it's not. Write once, thousands if not hundreds of thousands of users benefit. The end.

Surely MathML is better than an image for users of assistive technologies (screen readers, etc.).

That alone should be enough to favour it over just images.

You'd think, but not really: whether a screen reader reads out MathML or just reads out a proper alt attribute makes no difference.

Setting the alt attribute to whatever input generated the image should cover that.

I’ve come to the same conclusion, especially since I often end up using the same equations for both web and LaTeX documents, so it’s nice to have the same rendering engine everywhere, especially once you start making small tweaks to spacing etc.

One issue I’ve run into is that it’s not always that easy to get the style of the SVG images right, especially when it comes to sizing & placement. Did you find a good way to e.g. ensure that inline equations have the correct baseline alignment? Or a good method to ensure that equation sizes always match the font size of the paragraph they’re in?

I try not to ever use inline equations. Text is text, code is code, maths is maths, keeping it scoped to their own blocks tends to work better for readers.

> you should just turn your formulae into images that you load using `loading="lazy"`. And of course, to make sure they fit any resolution: generate SVG images.

One drawback to this approach is that SVG equations have a fixed layout, so e.g. they can't automatically line-wrap. Most of the equations on your Bézier page are pretty short, but I notice that you have elected to manually wrap them in some spots when they get too long, or in other cases just let them spill off to the side. This is most apparent with a narrow viewport (e.g. a phone), but you can also see this by using the responsive design mode on a desktop browser or even simply resizing the viewport. The longer equations get cut off on the right side and you have to scroll horizontally to see the whole thing, which isn't ideal. One of the benefits HTML is supposed to have over just, say, a PDF is the ability for the same document to reflow to different viewports.

The Bézier page illustrates another common issue with SVG images of equations: they have a lot of text in them, but none of it is searchable text. That means no Ctrl-F and no search engine indexing of that text. This is fixable via SVG, though, since SVG images can provide a searchable text layer. (I don't mean to single out your website, by the way, this is an issue with math all over the web. Also, selectable text is a longstanding bug in Cairo[0], which pdf2svg relies on to generate SVGs, so it's not an easy fix on your end anyway.)

> there is simply no reason to ever use it. Nothing is mining the web for maths, and semantic markup for maths buys you nothing.

MathML supports automatic linebreaking of equations. SVG doesn't. That's one simple reason to use MathML. I'm not sure whether this fits your definition of "semantic markup" or not, but it is useful. Linebreaking even has its own section in the MathML spec.[1]

It's also not true that nothing is mining the web for MathML. SearchOnMath[2], for example, indexes pages from the NIST DLMF[3], which uses MathML extensively.

[0] https://bugs.freedesktop.org/show_bug.cgi?id=38516

[1] https://www.w3.org/Math/draft-spec/chapter3.html#presm.lineb...

[2] https://www.searchonmath.com/about

[3] https://dlmf.nist.gov/help/mathml