|
|
|
|
|
by burntsushi
3385 days ago
|
|
> I thought that was an invalid code point. Surrogate codepoints are indeed valid codepoints. It's just that valid UTF-8 is not allowed to encode surrogate codepoints, so the space of codepoints supported by UTF-8 is actually a subset of all Unicode codepoints. This subset is known as the set of Unicode scalar values. ("All codepoints except for surrogates.") |
|