by WorldMaker
995 days ago
> The combinations aren't infinite here.

They certainly are. Languages are a creative space driven by the human imagination. Give people enough time and they'll build new combinations for fun, for profit, for research, or to capture a spoken word/tone poem in just the right sort of exciting way. You may frown on "Zalgo text" [1] (and it is terrible for accessibility), but it speaks to a creative mood or three.

The growing combinatorial explosion in Unicode's emoji space isn't an accident or something unique to emoji; it's a sign that emoji are just as much a creative language as everything else Unicode encodes. The biggest difference is that it is a living language with a lot of visible creative work happening in contemporary writing, as opposed to a language that some monks centuries ago decided was "good enough", and where school teachers long ago locked some of the creative tools in the figurative closet to keep their curriculum simpler and their days freer of headaches.

[1] https://en.wikipedia.org/wiki/Zalgo_text
|
We've got roughly 150K assigned codepoints, leaving us with roughly 950K unassigned ones. That's a truly massive amount of headroom.
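For what it's worth, here's a back-of-the-envelope sketch of that headroom. The codespace size, surrogate range, private-use ranges, and noncharacter count are fixed by the Unicode codespace layout; the 150,000 figure is only the rough estimate from above, not an exact count, and the function names are just made up for illustration.

```python
# Rough headroom arithmetic for the Unicode codespace.
TOTAL_CODEPOINTS = 0x10FFFF + 1            # U+0000..U+10FFFF = 1,114,112
SURROGATES = 0xDFFF - 0xD800 + 1           # 2,048, reserved for UTF-16
PRIVATE_USE = (
    (0xF8FF - 0xE000 + 1)                  # BMP private-use area: 6,400
    + (0xFFFFD - 0xF0000 + 1)              # plane 15: 65,534
    + (0x10FFFD - 0x100000 + 1)            # plane 16: 65,534
)
NONCHARACTERS = 66                         # permanently reserved noncharacters
ASSIGNED_ESTIMATE = 150_000                # assumption: rough figure, not exact


def generous_headroom(assigned: int = ASSIGNED_ESTIMATE) -> int:
    """Everything that is neither a surrogate nor counted as assigned."""
    return TOTAL_CODEPOINTS - SURROGATES - assigned


def strict_headroom(assigned: int = ASSIGNED_ESTIMATE) -> int:
    """Also sets aside private-use codepoints and noncharacters."""
    return generous_headroom(assigned) - PRIVATE_USE - NONCHARACTERS


print(f"generous: {generous_headroom():,}")
print(f"strict:   {strict_headroom():,}")
```

Depending on whether you count the private-use areas as available, you land somewhere between roughly 820K and 960K free codepoints, so the conclusion doesn't really change.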
To be honest I think this argument is rather too abstract to be of any real use: if it's a theoretical problem that will never occur in reality then all I can say is: <shrug-emoji>.
But like I said: I'm not "against" combining marks; purely in principle they're probably the better design. What I'm mostly against is two systems co-existing. In reality it's too late to move the world to decomposed text (for Latin, Cyrillic, and some others) because most text is already pre-composed, so we should go all-in on pre-composed for those scripts. With our roughly 950K unassigned codepoints we've got space for literally thousands of years to come.
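To make the "two systems co-existing" complaint concrete, here's a minimal sketch using Python's standard `unicodedata` module. The helper name `same_text` is just something I made up for illustration; the point is only that the same visible "é" has two different encodings that don't compare equal until you normalize.

```python
import unicodedata

precomposed = "\u00e9"    # é as a single pre-composed codepoint
decomposed = "e\u0301"    # e followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to the composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Naive comparison sees two different strings of different lengths.
print(precomposed == decomposed)
print(len(precomposed), len(decomposed))

# After normalization they are the same text.
print(same_text(precomposed, decomposed))
```

Every piece of code that compares, searches, or hashes text has to remember to do that normalization step, which is the practical cost of having both representations around.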
Also, this is a problem that's inherent in computers: on paper you can write anything, but computers necessarily restrict that creativity. If I want to propose something like a "%" mark on top of the "e" to indicate, I don't know, something, then I can't do that regardless of whether combining characters are used, never mind entirely new characters or marks. Unicode won't add it until it sees usage, so this gives us a bit of a catch-22, with the only option being mucking about with special fonts that use private-use codepoints (and hoping they won't conflict with something else).