by imron 3782 days ago
Actually, they are not combining code points. Take for example the character 𪚥 (4 dragons).

The codepoint is U+2A6A5, but in UTF16 it requires combining 2 utf16 characters (\uD869 and \uDEA5) in order to reference it.

The codepoint however is still exactly the same (U+2A6A5).
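
For anyone who wants to check the arithmetic, here is a rough sketch in Python of how a code point above U+FFFF is split into two 16-bit values (a surrogate pair). The function name `utf16_code_units` is made up for illustration, and it does no validation of its input.

```python
def utf16_code_units(code_point: int) -> list[int]:
    """Return the 16-bit values UTF-16 uses for one code point (illustrative helper)."""
    if code_point < 0x10000:
        # Code points in the Basic Multilingual Plane fit in one 16-bit value.
        return [code_point]
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low surrogate
    return [high, low]


units = utf16_code_units(0x2A6A5)
print([f"{u:04X}" for u in units])  # ['D869', 'DEA5']

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
print("\U0002A6A5".encode("utf-16-be").hex())  # d869dea5
```

The input is one code point (U+2A6A5) the whole time; only its encoded form is two values long.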

1 comment

> The codepoint is U+2A6A5, but in UTF16 it requires combining 2 utf16 characters (\uD869 and \uDEA5) in order to reference it.

No, you mean two UTF-16 code units. A character is one or more code points.
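
A small Python sketch of the distinction, assuming Python 3, where `len()` on a `str` counts code points. The helper `count_utf16_units` is a name invented here for illustration.

```python
import unicodedata


def count_utf16_units(s: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of s (illustrative helper)."""
    return len(s.encode("utf-16-le")) // 2


dragons = "\U0002A6A5"   # one code point outside the BMP
e_acute = "e\u0301"      # 'e' followed by COMBINING ACUTE ACCENT

# One code point, two UTF-16 code units (a surrogate pair).
print(len(dragons), count_utf16_units(dragons))   # 1 2

# One user-perceived character, two code points, two UTF-16 code units.
print(len(e_acute), count_utf16_units(e_acute))   # 2 2

# NFC normalization composes this particular sequence into one code point.
print(len(unicodedata.normalize("NFC", e_acute)))  # 1
```

The first case is a single code point that needs two code units; the second is a single character made of two code points. Neither count tells you the other.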