|
|
|
|
|
by xmprt
1605 days ago
|
|
This is a cool article about Unicode encoding however I still feel like it should be possible to reverse strings with Flag emojis. I don't see why computers can't handle multi rune symbols in the same way that they handle multi byte runes. We could combine all the runes that should be a single symbol and make sure that we're maintaining the ordering of those runes in the reversed string. Of course that means that naive string reversing doesn't work anymore but naive string reversing wouldn't work in the world of UTF-8 if we just went byte by byte. |
|
I actually don't know how/why Python is apparently using code points, since they are variable length. That seems like a compromise between using code units and using grapheme clusters that gets you the worst of both worlds.
Edit: Maybe it uses UTF-32 under the hood when it's doing array operations on code points?