by mike_hock
1600 days ago
WTF business do emojis have in Unicode? The BMP is all there ever should have been. Standardize the actual writing systems of the world, so everyone can write in their language. And once that is done, the standard doesn't need to change for a hundred years. What we need now is a standardized, sane subset of Unicode that implementations can support while rejecting the insane scope creep that got added on top of that. I guess the BMP is a good start, even though it already contains superfluous crap like "dingbats" and boxes.
Unicode didn't invent emoji. It incorporated them because they were already popular in Japan, and leaving them out would have greatly reduced Japanese adoption.
Keep in mind that Unicode was intended to unify all the disparate encodings that had been brewed up to support different languages, and which made exchanging documents between non-English-speaking countries a nightmare. The term "mojibake" comes to mind [0] - Japan alone had so many encodings that a slang term came about for text that was encoded with something other than what your device expected and so got rendered as garbled nonsense. And they weren't alone, of course [1].
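To make the mojibake point concrete, here is a minimal Python sketch. The choice of Shift_JIS as the writer's encoding and Windows-1252 as the reader's wrong guess is just an assumption for illustration; any mismatched pair of legacy encodings behaves similarly.

```python
# The Japanese word "mojibake", written by a system that uses Shift_JIS
# (one of several legacy Japanese encodings; picked here for illustration).
original = "文字化け"
raw = original.encode("shift_jis")

# A reader that wrongly assumes Windows-1252 does not fail outright.
# It maps each byte to an unrelated Western character instead.
garbled = raw.decode("cp1252", errors="replace")

# Decoding with the encoding the writer actually used recovers the text.
recovered = raw.decode("shift_jis")

print(repr(garbled))
print(recovered == original)
```

The bytes themselves carry no label saying which encoding produced them, which is why every pair of systems had to agree out of band before Unicode.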
> What we need now is a standardized, sane subset of Unicode that implementations can support while rejecting the insane scope creep that got added on top of that.
Unicode wasn't intended to be pretty. It was intended to be the one system that everyone used, and increasing adoption meant doing some less-than-ideal things, like encoding duplicate characters so that existing text would be easier to convert to Unicode.
You may never need anything outside the BMP, but that doesn't make the rest of the planes worthless. Set aside the value of including dead and nearly extinct languages for preservation purposes (not being able to type a language will basically guarantee its extinction, and the only real alternatives are inventing a new encoding or storing text as JPEGs). Even then, the SMP holds scripts for languages with a lot of speakers [2][3]. For example, [2] is a script for Marathi, which has about 83 million native speakers.
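As a rough illustration of what "outside the BMP" means in practice, the Python sketch below computes the plane number of a few code points. The helper name `plane` and the sample characters are my own picks: U+11600 falls in the Modi block and U+11103 in the Chakma block.

```python
def plane(ch: str) -> int:
    """Return the Unicode plane of a single character (0 = BMP, 1 = SMP)."""
    return ord(ch) >> 16

samples = {
    "Latin A": "A",                # U+0041, BMP
    "Hiragana": "\u3051",          # U+3051, BMP
    "Modi": "\U00011600",          # U+11600, Modi block, SMP
    "Chakma": "\U00011103",        # U+11103, Chakma block, SMP
    "Emoji": "\U0001F600",         # U+1F600, SMP
}

for name, ch in samples.items():
    # Code points above U+FFFF need a surrogate pair (4 bytes) in UTF-16.
    utf16_bytes = len(ch.encode("utf-16-le"))
    print(f"{name}: U+{ord(ch):04X} plane={plane(ch)} utf16_bytes={utf16_bytes}")
```

An implementation that only handles the BMP would reject or mangle the last three, emoji and living scripts alike.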
[0]: https://en.wikipedia.org/wiki/Mojibake
[1]: https://segfault.kiev.ua/cyrillic-encodings/
[2]: https://en.wikipedia.org/wiki/Modi_(Unicode_block)
[3]: https://en.wikipedia.org/wiki/Chakma_(Unicode_block)