by panic 3257 days ago
The first problem isn't really a problem, since ligatures are provided by fonts, not character encodings. From the Unicode FAQ on ligatures and digraphs (http://unicode.org/faq/ligature_digraph.html):

"The existing ligatures exist basically for compatibility and round-tripping with non-Unicode character sets. Their use is discouraged. No more will be encoded in any circumstances.

"Ligaturing is a behavior encoded in fonts: if a modern font is asked to display “h” followed by “r”, and the font has an “hr” ligature in it, it can display the ligature. Some fonts have no ligatures, some (especially for non-Latin scripts) have hundreds. It does not make sense to assign Unicode code points to all these font-specific possibilities."

The second problem still stands, though, especially since these sequences of characters can be tokenized differently in different programming languages. IMO, if you're going to have character replacement like this, it should be a configurable editor feature like syntax highlighting.

4 comments

I don't know what it would do to editor rendering performance, but disabling the `liga` OpenType feature for spans detected as strings would solve the majority of instances of #2.
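To make the idea concrete, here is a minimal sketch of display-only substitution that skips string literals. It is not how any particular editor or font engine does it; the `LIGATURES` table and the `render` function are hypothetical names made up for illustration, and the string detection is a simple regex that only understands double-quoted literals.

```python
import re

# Hypothetical display-only replacement table (an assumption for illustration).
# The underlying source text is never modified, only what gets shown.
LIGATURES = {"->": "→", "!=": "≠", ">=": "≥", "<=": "≤"}

# A double-quoted string literal with backslash escapes (simplified on purpose).
_STRING = r'"(?:\\.|[^"\\])*"'
# Any of the character sequences we want to display as a single glyph.
_SEQ = "|".join(re.escape(k) for k in LIGATURES)
# Group 1 matches a string literal, group 2 matches a replaceable sequence.
_TOKEN = re.compile(f"({_STRING})|({_SEQ})")


def render(line: str) -> str:
    """Return the text to display for `line`, leaving string literals as typed."""
    def repl(match: re.Match) -> str:
        if match.group(1) is not None:
            # Inside a string literal: show the characters unchanged.
            return match.group(1)
        return LIGATURES[match.group(2)]

    return _TOKEN.sub(repl, line)


print(render('if a != b: print("a != b")'))
```

A real editor would use its syntax highlighter's tokens rather than a regex, since what counts as a string differs between languages.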

All in all, though, this is purely a matter of local dev preference: your editor's ligature settings never affect the committed code, so neither is really a problem in practice.

I'm a little miffed that the blog author doesn't have a good understanding of how Unicode and OpenType cooperate. Your first point is a great example of that.

As for your second point, I think what I outline here solves most of the problems that the other commenters are arguing about.

Actually, such features can be somewhat properly implemented in OpenType. You can tag these alternate glyphs as stylistic sets, with each stylistic set supporting so-and-so family of programming languages. Then, by default, an unaware editor would not perform the ligature substitution.

However, proper support will still require the editor to be aware of these sets through some standard API, so that it selects the right stylistic set for the right text (e.g. comments vs. code).

Emacs already does a form of character substitution through prettify; I use it all the time with LaTeX, and have found it delightful to work with. It substitutes commands that stand for mathematical symbols with the corresponding mathematical symbols defined in Unicode. The limitation is that some of the features illustrated in Fira Code, such as the prettified Markdown header, don't have a corresponding Unicode code point, and thus necessarily have to be implemented as ligatures in a stylistic set.

A final note on the productivity of substituting input characters with more semantically representative symbols for display: when done well, it is not obtrusive and shouldn't hinder productivity. After all, the Chinese and Japanese do this all the time with a more clunky system (IME) in their digital input, and they get by well enough with it.

Isn't the choice of font inherently an editor feature like syntax highlighting?

I don't think they're talking about the choice of font, but instead the choice of replacing certain character sequences (e.g. ->) with glyphs (e.g. →).

This is exactly what I was going to say. One big issue I had when experimenting with ligatures for Hylang (a dialect of LISP) was that they did not keep the spacing of the original character combinations. While I thought the ligatures were much more concise, disabling them shifted things around, which ruined the indentation. It made the code ugly and hard to read for anyone not using them, so I had to give them up.

Hopefully one day we can all have language-specific characters that make our code more concise. Until then, I'll stick to fonts that keep the spacing of the original intact.
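The spacing problem is easy to reproduce in a few lines. This is only a sketch, assuming each glyph occupies exactly one column in a monospace font; the `GLYPHS` table and `substitute` function are hypothetical names, not anything from a real editor or font.

```python
# Hypothetical replacement table; each glyph is assumed to be one column wide.
GLYPHS = {"->": "→", "!=": "≠"}


def substitute(text: str, keep_width: bool) -> str:
    """Replace sequences with glyphs, optionally padding to the original width."""
    for seq, glyph in GLYPHS.items():
        # With keep_width, pad the glyph with spaces so it spans as many
        # columns as the sequence it replaces; otherwise the line gets shorter.
        shown = glyph.ljust(len(seq)) if keep_width else glyph
        text = text.replace(seq, shown)
    return text


line = "a -> b"
narrow = substitute(line, keep_width=False)
padded = substitute(line, keep_width=True)

# Column of "b": unchanged when the width is kept, shifted left otherwise.
print(line.index("b"), narrow.index("b"), padded.index("b"))
```

Anything aligned under `b` on the following lines stays aligned only in the width-preserving case, which is why code indented with narrow substitutions on looks broken to everyone who has them off.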