by masklinn
4953 days ago
Which, at the end of the day, doesn't really matter, since a code point is not a "character" in the sense of "the smallest unit of writing" (as interpreted by an end user): many "characters" may (depending on the normalization form) or will (jamo) span multiple code points. Splitting a string at an arbitrary index of a character array is always broken, regardless of surrogate pairs.
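To make that concrete, here is a minimal Python sketch using the standard `unicodedata` module. The sample strings and the helper name `code_points` are my own illustrative choices, not anything from the thread, and Python strings index by code point, so surrogate pairs are not involved at all here.

```python
import unicodedata


def code_points(text: str, form: str) -> int:
    """Count the code points `text` occupies in the given normalization form."""
    return len(unicodedata.normalize(form, text))


e_acute = "\u00e9"  # "é", one user-perceived character
hangul = "\ud55c"   # "한", one user-perceived Hangul syllable

# The same "character" has a different length depending on normalization.
print(code_points(e_acute, "NFC"))  # 1 (precomposed)
print(code_points(e_acute, "NFD"))  # 2 ("e" + combining acute accent)
print(code_points(hangul, "NFC"))   # 1 (precomposed syllable)
print(code_points(hangul, "NFD"))   # 3 (three conjoining jamo)

# Cutting at a code point index splits the character and drops the accent.
truncated = unicodedata.normalize("NFD", e_acute)[:1]
print(truncated)  # e
```

Getting this right means splitting on grapheme cluster boundaries (Unicode UAX #29), which Python's standard library does not do for you, so you would need a third-party library or your own segmentation.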