by wannesm 5203 days ago
I love LaTeX but it (or more specifically TeX) is showing its age. It's perfect for writing a paper given a good template, but writing and debugging these templates (.cls, .sty) is unnecessarily hard. For my job (in academia), I have to update such LaTeX templates on a monthly basis and always end up looking at the current improvements for LaTeX. This updating task involves not 'writing' in LaTeX but 'programming' in LaTeX. If you are used to modern programming languages, TeX is stubborn and hard, so I always have the feeling that improvement is possible (I mean, Python, JavaScript, C++, ... are all easier to debug). Although people are working hard on this and doing interesting work (e.g., LuaLaTeX, LaTeX3), I feel the underlying TeX language has had its time and a more drastic change might be necessary. Remember that TeX was designed for computers with an order of magnitude fewer resources.

An increasingly interesting LaTeX replacement is, maybe surprisingly, HTML in combination with CSS and JavaScript. With every browser update the inspection and debugging tools become more powerful, and every time I can track down layout and programming bugs a little bit faster. With the addition of more properties targeting paged media in CSS3, it is now also possible to create nice-looking PDFs starting from HTML. Prince (http://www.princexml.com/), for example, is ahead of the browsers on CSS paged media properties and outputs a PDF file directly. Typical features for which we praise LaTeX are also becoming available:

- Mathematical formulae: http://www.mathjax.org/

- Bibliographies: http://citationstyles.org/

- Advanced hyphenation: http://code.google.com/p/hyphenator/

Most people only use the basic commands and don't care about the underlying engine. Therefore, LaTeX as a 'writing' language is not too attached to the TeX 'programming' language. Pandoc (http://johnmacfarlane.net/pandoc/) could be enough to translate such LaTeX code to another markup format and use another engine. (Academics tend to use advanced LaTeX macros only when in need of space ;-))
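
A minimal sketch of that Pandoc route, in Python, assuming `pandoc` is installed and on the PATH. The file names `paper.tex` and `paper.html` are placeholders made up for illustration. It only uses Pandoc's `-f`, `-t`, `-s`, `-o` and `--mathjax` options; a paper that leans on custom macros or unusual packages will probably not convert cleanly.

```python
import shutil
import subprocess


def pandoc_command(source, target, to_format="html"):
    """Build the pandoc argument list for converting a LaTeX file."""
    # -f latex: read LaTeX, -t: output format, -s: standalone document.
    cmd = ["pandoc", "-f", "latex", "-t", to_format, "-s", "-o", target]
    if to_format == "html":
        # Ask pandoc to render math in the browser through MathJax.
        cmd.append("--mathjax")
    cmd.append(source)
    return cmd


def convert(source, target, to_format="html"):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(source, target, to_format), check=True)


if __name__ == "__main__":
    # Placeholder file names; only prints the command, does not run it.
    print(" ".join(pandoc_command("paper.tex", "paper.html")))
```

Calling `convert("paper.tex", "paper.html")` would run the conversion for real; the resulting HTML can then be styled with CSS and fed to a browser or to Prince.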

6 comments

I don't really mind TeX's age since the results of typesetting with it are very nice and there is practically no alternative, at least in academia. I find it a bit hard to imagine that HTML can do the same job. The minor (or major) inconsistencies between HTML engines can lead to many different renderings of the same source, and I don't think you want that for your texts. I agree though that typography in browsers has come a long way since the early 00s and is definitely on the right track. The tools you mention are nice, but setting them up does not seem to be any less complicated than TeX and friends (not to mention that the hyphenated result [1] from Hyphenator looks so bad; it begs for margin protrusion, but that's a different issue).

In my opinion LuaTeX (and its LaTeX counterpart) is the future of TeX. It builds upon the strong base of TeX and combines it with Lua which is a fine language for this purpose.

[1] http://hyphenator.googlecode.com/svn/tags/Version%204.0.0/Wo...

Due to its age, there is a massive codebase in TeX/LaTeX. There is literally a package for everything. It would be a massive undertaking to replace the TeX system with something that isn't backwards compatible. And since the vast majority of LaTeX users don't get to see much of the underlying mess, the pressure to do so is rather small.
Not necessarily. New people in academia learn LaTeX all the time, and if more students start getting upset about having to learn it while there's an alternative that, while lacking a zillion packages, is easy to use and just works right out of the box... then I don't see how professors would be able to hold back the tide. I've seen engineers write papers in Microsoft Word, though they're a minority. And then there are those English majors who just cajole a buddy with InDesign into typesetting their thesis for them.
Students have always been upset about learning LaTeX. However, after the first or second paper they realize how much trouble LaTeX is saving them and become fans for life. I don't see any other system offering the same advantages.
Around here I've seen a considerable drop-off in TeX fans over the past 4-5 years, with more students opting for Word. It didn't use to be a viable option, but now many conferences (outside a few very-math-heavy areas) offer both LaTeX and Word stylesheets and let you choose, and Word has improved in a few key areas, mainly citation support and auto-hyphenation. Zotero users also seem to like the Word integration. Figure placement still sucks, but figure placement isn't really TeX's strong suit either.

I personally still prefer TeX, but the gap is smaller than it was 10 years ago. The auto-hyphenation is probably the single biggest change, since the easy way to spot 2-column conference papers done in Word used to be the horrible whitespace in justified columns caused by its inability to break words. (This heuristic still works sometimes, because auto-hyphenation isn't on by default, and not all authors know about it or use it.)

I'm curious. Why do you need to update templates so frequently? I would have expected academia to have a fairly static, albeit perhaps large, set of templates. E.g., at most one for each journal in your field, one for each conference, one for each publisher, with any of these only changing rarely.
That is correct for the general style. However, then you have to hunt for packages for pseudocode, nicer tables, diagrams, pdf bookmarks and hyperlinks, subfigures, etc.
\rant{I think it's funny that you say this, because I was just remarking to someone the other day that the primary way I've seen TeX and its little forest friends age is that it's become rather unwieldy IN SPITE OF the fact that the hardware it's running on now is far more than an order of magnitude faster.

In college, I was typesetting my work in PlainTeX (I never did like LaTeX, but obviously I had it available) on a 14.77MHz 68000-based Amiga 2000, and the TeX distro came on floppies. I had a whopping 40MB hard drive, and all the heavy lifting lived there - Metafont, dvips, tex itself. But they fit comfortably on 800K low-density floppies and ran from them, if you needed to. The other floppies were all fonts, and since the prevailing format for fonts in the rest of the world was Type 3 PostScript (yucky bitmap) and comparable TrueType, my work looked rockin.

So to review, I had it running largely off floppies on a machine a couple of orders of magnitude slower (and with a couple of orders of magnitude less memory and storage) than my iPhone. And I frequently taught freshman English majors who wouldn't own their own computer for another 3 years how to use it, down to font rendering and selecting an output format for the target printer, which was rarely PostScript back then.

Riddle me this then: why are current TeX distros completely indecipherable to me now? I mean, kpathsea was always a bit of a beast, but I understood it pretty much at a glance. How is it that, although I've used the platform on and off for two decades now, in the last 5 years I've had to call the Psychic Friends Network every time I tried to call a package that I thought I had installed correctly? Oh, and why is a whole install now larger than the sum total of all the storage I had at my disposal - every floppy, hard drive, mainframe quota, and gettemp limit - when I last used the system on a daily basis?

As far as I can tell, the last update to the core product was in 2008, and everything that's been added to the main engine since 1992 has been incremental support for things like modern font formats. So it should have grown linearly, not exponentially. But there it is. Big as life and twice as ugly.

This is actually the second question in five days I've seen in two different fora about, "How is TeX holding on?" And to look at the sample output that was produced by Lout, obviously the answer is, "Because no one ever came up with a replacement that produced better output." You don't have to ask Don Knuth to figure that one out. It's not that Lout hasn't surpassed TeX yet. It's that it hasn't beaten troff yet. The 70s called, and they're looking for their DEC LP01, man.

But I don't think people are actually voicing the question in their heads. I think the question they're actually asking is, "Who let this godawful piece of Frankencode run through the village terrorizing the children, and why won't someone please scrape it all into a pile and teach it how to sing Puttin' on the Ritz like it did 20 years ago?" Or, "If you got this thing back into shape, why wouldn't it be the rendering engine for ebooks, because if it's set up right, it can render a whole book from source live on an iPad, which is 100x more powerful than its original compile target?" I can think of 20 questions like this. All the questions ultimately boil down to a wonderment that one of the best pieces of software ever written for making readable output is cared for so shoddily. It's like some laboratory experiment gone amuck on how layering bad abstractions on things makes even awesome things awful. }

And now for my next trick, I'm going to go integrate XeTeX into my current product to generate custom typeset results for customers. No, seriously, I am. I see 20 more years of this platform in my future...

I haven't looked into why TeX distributions (looking at you TeX Live) are so big these days, because it mostly doesn't matter. I suspect that I could take almost everything I write today and typeset it on your Amiga without too many problems.
You don't know why TL is so big? Because it includes everything on CTAN that is compatible with the DFSG. CTAN is an enormous archive containing numerous packages that do the same thing slightly differently. How many packages are there like booktabs/tabulary/tabularx/longtable etc.? How many enumitem-like packages are there?

What is so confusing?

"TL is big because it includes everything possible" isn't the confusing part. What's confusing is why it has doubled in size every couple of years. Is that all from CTAN? The other question could be: why do they include everything when they have a perfectly good package manager? I don't know if TL will do on-demand loading, but MiKTeX on Windows did so years ago, so it should be possible.
The documentation for the different languages is huge. Plus all of the architectures. When I install, I only install English/linux-x86-64 and it's not that big. Plus, fonts do not compress that well.
I tried [0]. Firefox does an ok page layout, if you print it. I did not try Prince.

One big TODO is the copyright notice in the bottom left column first page. Two column-wide figures are also an issue.

Thanks for citationstyles.org. When I have time, I'll try to integrate that.

[0] http://beza1e1.tuxen.de/acm_html/test.html

Prince already supports the new page float properties. A copyright notice in the ACM style works for me with "float: bottom;". The W3C CSS3 draft [1] has some more examples, including two-column-wide figures.

[1] http://www.w3.org/TR/css3-gcpm/#page-floats
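
A rough sketch of that setup, written as a small Python helper and assuming Prince is installed as `prince`. The class name `.copyright`, the file names and the page size are hypothetical; only `float: bottom` comes from the comment above, and the rest is ordinary CSS. Browsers will most likely ignore the page float, which is the point of running it through Prince.

```python
from pathlib import Path

# Hypothetical stylesheet: only `float: bottom` is taken from the thread;
# the selector name, page size and font size are illustrative assumptions.
PAGED_CSS = """\
@page { size: A4; margin: 2cm; }
.copyright { float: bottom; font-size: 8pt; }
"""


def write_stylesheet(path):
    """Write the paged-media stylesheet to disk and return its path."""
    Path(path).write_text(PAGED_CSS, encoding="utf-8")
    return path


def prince_command(html_file, css_file, pdf_file):
    """Build the Prince command line: extra stylesheet, input, output PDF."""
    return ["prince", "--style=" + css_file, html_file, "-o", pdf_file]


if __name__ == "__main__":
    # Placeholder file names; prints the command instead of running Prince.
    print(" ".join(prince_command("test.html", "paged.css", "test.pdf")))
```

Passing the list from `prince_command` to `subprocess.run` would produce the PDF, provided the HTML contains an element with the (assumed) `copyright` class.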

Why don't you link your articlecls [0] here? ;)

[0] http://wannesm.be/articlecls/

There are a lot of alternatives for the easy stuff. TeX makes the harder stuff like bibliographies, cross references, figure and equation numbering, indices and aligning equations relatively easy. As a mathematician, I would need to see the equivalent of "Math into LaTeX" before I would even consider switching.