by yuubi
2859 days ago
UTF-16 has a limit on the size of a code point because a code point is either a single ordinary code unit or a pair of surrogate code units, each carrying 10 bits of the code point (I think I used the right terminology). A pair therefore adds only 20 bits on top of the 16-bit range, which is where the U+10FFFF ceiling comes from. UTF-8 has a natural extension path to encodings of up to 7 bytes that keep all the usual UTF-8 properties: the first code unit indicates how many remain, and the other code units are recognizable as not being the first.
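To make the comparison concrete, here is a rough Python sketch of both schemes. `utf16_encode` follows the standard surrogate-pair arithmetic. `extended_utf8_encode` is a hypothetical name for the generalized pattern: forms up to 4 bytes match today's UTF-8, the 5- and 6-byte forms match the older pre-RFC 3629 definition, and the 7-byte form (lead byte 0xFE, 36 payload bits) is an assumption about how the pattern would continue, not part of any standard. It also skips the surrogate-range check a real encoder would need.

```python
def utf16_encode(cp: int) -> list[int]:
    """Encode one code point as a list of UTF-16 code units."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not encodable in UTF-16")
    if cp < 0x10000:
        return [cp]  # a single ordinary code unit
    v = cp - 0x10000  # 20 bits left to split across the pair
    high = 0xD800 | (v >> 10)    # high surrogate: top 10 bits
    low = 0xDC00 | (v & 0x3FF)   # low surrogate: bottom 10 bits
    return [high, low]


def extended_utf8_encode(cp: int) -> bytes:
    """Hypothetical generalized UTF-8 encoder, 1 to 7 bytes per code point."""
    if cp < 0:
        raise ValueError("negative code point")
    if cp < 0x80:
        return bytes([cp])  # 1-byte form: 0xxxxxxx
    for n in range(2, 8):
        # The lead byte spends n one-bits plus a zero bit on the length,
        # leaving 7 - n payload bits; each trailing byte carries 6 more.
        payload_bits = (7 - n) + 6 * (n - 1)
        if cp < (1 << payload_bits):
            lead_prefix = (0xFF << (8 - n)) & 0xFF  # 0xC0, 0xE0, ..., 0xFE
            out = [lead_prefix | (cp >> (6 * (n - 1)))]
            for i in range(n - 2, -1, -1):
                # Trailing bytes are always 10xxxxxx, so they can never
                # be mistaken for a lead byte.
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise ValueError("needs more than 7 bytes")


if __name__ == "__main__":
    print([hex(u) for u in utf16_encode(0x1F600)])
    print(extended_utf8_encode(0x20AC).hex())
    print(len(extended_utf8_encode(1 << 31)))
```

With this pattern the 7-byte form tops out at 2^36 - 1, while the surrogate-pair scheme has no spare bits to grow into without changing what the existing code units mean.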