|
|
|
|
|
by kevindamm
673 days ago
|
|
It's a design decision. On one end, if I'm reading your question correctly, you could use 0xFFFD (the replacement character) for anything not recognized as language-specific characters in the BMP and SMPs (this can be done within practically all existing Unicode libraries by filtering on character class) which will inadvertantly filter some non-emoji symbols and doesn't really convey any information (it can even look unprofessional, it reminds me a lot of the early web during the pre-unicode growing pains of poorly implemented i18n/l11n). There are libraries like Unidecode[0py] [0go] [0js] which convert from unicode to ASCII text that might be easiest to include in a TUI. All the ones I looked at will convert emoji to `[?]` but many other characters are converted to that, too, including unknowns. On the other end you can keep a running list of what you mean by emoji[1] and pattern match on those characters, then substitute for a representative emoji. But it will still pose some difficulty around what to choose for the representative symbol and how to make it fit nicely within a TUI. An example of a library for pattern-matching on emoji is emoji-test-regex-pattern[2] but you can see it is based on a txt file that needs to be updated to correspond with additions to Unicode. [0py]: https://github.com/avian2/unidecode [0go]:
(actually there are a few of these) https://pkg.go.dev/github.com/gosimple/unidecode [0js]: https://github.com/xen0n/jsunidecode [1]: these aren't really contiguous ranges, and opinions vary, see https://en.m.wikipedia.org/wiki/Emoji#Unicode_blocks [2]: https://github.com/mathiasbynens/emoji-test-regex-pattern |
|