Hacker News
by AnotherGoodName 292 days ago
The line is really fuzzy, honestly. We have some universal pictograms that are known and reasonably well understood around the world, and the way they are used is pretty much a writing system. An icon of a man or woman on a bathroom door? You may draw it in one of a million different styles (fonts), but the general idea is used around the world as a common writing system. I'd say that belongs in Unicode.

The real problem is that the alphabets of certain writing systems are unbounded. Emojis are completely unbounded. That's the only reason to be concerned about them in Unicode. Unicode is a limited set by definition, and emojis are an unbounded set.


In my opinion the job of the Unicode Consortium would have been to encode what has significant and organic usage. Similarly to how Wikipedia only includes what has significant organic and externally validated coverage. If they'd stuck to that mission the line would have been a lot less fuzzy.

The problem with that is, of course, that "significant" is subjective.

Modern Western society is preoccupied with questions of racial and gender identity, and it is generally accepted in that society that this topic is "significant". Since that is the society the Unicode Consortium works within, this explains how you get six different colors of "man-pregnant" emoji in a world where there possibly haven't been six different-colored pregnant men.

Significance is subjective only in the heat of the moment, and much less so in retrospect. What I am arguing is that the Unicode Consortium should only add characters that have what Wikipedia would call notability.

I would like to stress that I am not arguing against the addition of U+1FAC3 PREGNANT MAN or U+1FAC4 PREGNANT PERSON; there are good reasons to add these. But do we need mundane, arbitrary everyday items like U+1FAA9 MIRROR BALL? I'd say no.

Actually, I don't think there's a good argument to add either U+1FAC3 PREGNANT MAN or U+1FAC4 PREGNANT PERSON: expressing gender can already be done with a modifier like how skin/hair color and professions are also expressed, and we already have U+1F930 PREGNANT WOMAN.

For example, this is the unicode sequence for bearded lady:

  U+1F9D4  person with beard
  U+200D   zero width joiner
  U+2640   female sign

So a pregnant man could simply be this expression:

  U+1F930  pregnant person (woman is implied by lack of modifier)
  U+200D   zero width joiner
  U+2642   male sign

But no, instead we must have this combinatorial explosion of compositions because Unicode can't decide if it wants to be a symbol library or an expression library. So now we have duplicates like U+1F40F ram and U+1F411 ewe, U+1F404 cow and U+1F402 ox, U+1F9D2 child and U+1F466 boy and U+1F467 girl (but a baby boy must be expressed as U+1F476 U+200D U+2642), U+1F468 man and U+1F469 woman and U+1F9D1 person, and U+1F385 Santa Claus and U+1F936 Mrs. Claus but also U+1F9D1 U+200D U+1F384 non-gendered Claus.
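To make the composition idea concrete, here is a minimal Python sketch that builds the sequences listed above from their code points. The helper names `zwj_sequence` and `describe` are made up for illustration, and the pregnant-man sequence is only the proposal from this comment, not something Unicode defines.

```python
import unicodedata

ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER


def zwj_sequence(*code_points):
    """Return the code points as one string with U+200D between each pair."""
    return ZWJ.join(chr(cp) for cp in code_points)


def describe(text):
    """List each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Sequence quoted above for the bearded lady.
bearded_woman = zwj_sequence(0x1F9D4, 0x2640)

# Hypothetical: the composition proposed above, not a sequence Unicode defines.
proposed_pregnant_man = zwj_sequence(0x1F930, 0x2642)

# Sequence quoted above for the non-gendered Claus.
mx_claus = zwj_sequence(0x1F9D1, 0x1F384)

for label, seq in [
    ("bearded woman", bearded_woman),
    ("proposed pregnant man", proposed_pregnant_man),
    ("non-gendered Claus", mx_claus),
]:
    print(f"{label}: {len(seq)} code points")
    for line in describe(seq):
        print("  " + line)
```

Note that the names printed are the formal Unicode character names, which can differ from the short labels used above. Also, as far as I know, the fully-qualified form of the bearded-woman sequence appends U+FE0F VARIATION SELECTOR-16 after the female sign; the sketch follows the listing above and leaves it out.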
The concept of notability is similarly subjective: issues like the aforementioned race/gender identity (or, e.g., the recent Russian-Ukrainian war) are perceived as very notable on English Wikipedia.

I admit that it is not perfect. My argument is that Wikipedia at least tries to draw the line. There was a time, even for Wikipedia, before the inclusionists vs. deletionists conflict, when this was not the case. Unicode is still stuck in this phase, and that is what I lament.