| HN Mirror

by chrismorgan 2222 days ago

Ugh, BI (Business Insider) mangles the emoji badly: the breadcrumb turns each emoji into two U+FFFD REPLACEMENT CHARACTERs, which suggests that something converts the text to UTF-16 (that accursed encoding that ruined Unicode by making the always-doomed UCS-2 live longer even though the just about uniformly superior UTF-8 was already available, and which persists in distressingly many languages), then tries to convert it to UTF-8 by iterating over each UTF-16 code unit rather than each Unicode scalar value. These emoji lie outside the Basic Multilingual Plane, so each one becomes a surrogate pair in UTF-16, and each lone surrogate then gets replaced separately. Then the summary and body of the article just drop the emoji altogether!
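To illustrate the failure mode guessed at above, here is a minimal Python sketch. It is only a model of the suspected bug, not BI's actual code: the function names (`utf16_code_units`, `naive_to_utf8`, `correct_to_utf8`) are made up for this example, and the assumption that lone surrogates get swapped for U+FFFD is inferred from the observed output.

```python
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def naive_to_utf8(units: list[int]) -> bytes:
    """Hypothetical buggy converter: treats every code unit as a whole character."""
    chars = []
    for unit in units:
        if 0xD800 <= unit <= 0xDFFF:
            # A lone surrogate is not a Unicode scalar value, so it cannot be
            # encoded as UTF-8; assume the converter substitutes U+FFFD.
            chars.append(REPLACEMENT)
        else:
            chars.append(chr(unit))
    return "".join(chars).encode("utf-8")


def correct_to_utf8(units: list[int]) -> bytes:
    """Combine surrogate pairs into scalar values first, then encode."""
    data = b"".join(unit.to_bytes(2, "big") for unit in units)
    return data.decode("utf-16-be").encode("utf-8")


eye = "\U0001F441"  # U+1F441 EYE
units = utf16_code_units(eye)
print([hex(u) for u in units])               # the two surrogate code units
print(naive_to_utf8(units).decode("utf-8"))  # mangled result
print(correct_to_utf8(units).decode("utf-8"))
```

The naive path produces one replacement character per code unit, which matches the "two per emoji" pattern in the breadcrumb; the correct path round-trips the emoji unchanged.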

———

I’m fascinated to observe that Firefox Nightly on Windows, when the font stack doesn’t include Segoe UI Emoji (e.g. the headline of the Business Insider article and the body of https://xn--mp8hai.fm/statement, but not the header of the statement), does not emojify the first U+1F441 EYE but does emojify the second. I can’t think of any way this could not be a bug. (Update: found a report from about a year ago, https://bugzilla.mozilla.org/show_bug.cgi?id=1567178 .)

Chrome is not emojifying either, which is reasonable when its font fallbacks hit some other font that includes EYE first.

For best results on things like this, include U+FE0F VARIATION SELECTOR-16 after each code point to say “emojify it if you possibly can”. (See also U+FE0E VARIATION SELECTOR-15, which says “render it in the old style without colour, please”.) Then you don’t need to worry about whether the font stack hard-codes system emoji fonts.
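A rough Python sketch of that advice follows. The helper name `with_presentation` and its `targets` parameter are invented for this example; the caller is assumed to pass only characters that actually have a standardized emoji variation sequence (U+1F441 EYE is one), since a selector after any other character has no defined effect.

```python
VS15 = "\uFE0E"  # U+FE0E VARIATION SELECTOR-15: request text presentation
VS16 = "\uFE0F"  # U+FE0F VARIATION SELECTOR-16: request emoji presentation


def with_presentation(text: str, targets: set[str], selector: str = VS16) -> str:
    """Append selector after each target character that lacks one already.

    targets is a caller-supplied set of single characters (an assumption of
    this sketch, not a lookup against the Unicode data files).
    """
    out = []
    for index, char in enumerate(text):
        out.append(char)
        following = text[index + 1] if index + 1 < len(text) else ""
        # Skip characters that already carry an explicit selector.
        if char in targets and following not in (VS15, VS16):
            out.append(selector)
    return "".join(out)


eye = "\U0001F441"  # U+1F441 EYE
marked = with_presentation(f"{eye} {eye}", {eye})
print([f"U+{ord(c):04X}" for c in marked])
```

Passing `selector=VS15` instead asks for the monochrome text style. Running the function on already-marked text leaves it unchanged, so it is safe to apply more than once.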

1 comment

xvilka 2222 days ago

One more reminder that UTF-16 should go away and everyone should use UTF-8 everywhere [1].

[1] http://utf8everywhere.org/
