|
|
|
|
|
by DinaCoder98
822 days ago
|
|
I was pessimistic about grapheme-based orientation towards text, deleted it to research more, and I've come to the conclusion that this is simply not a consensus opinion. Can you give me an example where grapheme-based sorting makes a critical difference from codepoint-oriented sorting on a normalized text? Full unicode composition certainly seems to provide a reasonable solution with western languages, CJK characters, and romanization of CJK characters, but that leaves a hell of a lot of scripts that I don't know about. I mean unicode is incredibly complex, but it doesn't even seem like there's a consensus outside of swift's string implementation of what a grapheme even is. (Granted, this might support the above concept that people can't even agree on what a string is, but unicode code points seems like a reasonable baseline to expect from a modern language. That said, rust doesn't even include unicode normalization in the standard library, although the common crate for it seems like a reasonable solution.) |
|
In recent years, application support for this has greatly improved.