|
|
|
|
|
by prodigal_erik
4948 days ago
|
|
They're facets of the same problem. I shouldn't routinely be dealing with either surrogates or combining marks; unless I have a specific reason, it's only an opportunity to make a mistake that hardly anyone knows how to troubleshoot. "n̈" should be an indivisible string of length one until I need to ask how it would actually be encoded in UTF-16 or whatever. |
|
Add a small number of people who for artistic reasons decide that they want to make life hard (Rinôçérôse I'm looking at you) and you just have to accept that the length of your string might not equal the number of codepoints contained therein...