by faragon 3397 days ago
Sure, that's why I pointed out that it could be expanded at any time: the encoding already supports expansion, by design :-)

The limiting factor on Unicode is UTF-16. There are only enough surrogates for 16 astral planes, which is why Unicode has 17 planes of 2^16 code points each (the BMP plus the 16 astral planes).
UTF-16 has reserved codes as well, so it could also be expanded to cover 2^32 codes.
The range U+D800-DFFF is reserved for UTF-16 surrogates, split into two halves of 1024 code units each: high surrogates (U+D800-DBFF) and low surrogates (U+DC00-DFFF). That means every surrogate pair can encode 10 + 10 = 20 bits of information, which is where the 16 astral planes come from (4 bits to select the plane, 16 bits within it). Otherwise, there are just 128 code points in unallocated blocks in the BMP.
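To make the 10 + 10 bit arithmetic concrete, here is a minimal Python sketch of the surrogate pair mapping. The function names encode_surrogates and decode_surrogates are mine, picked for illustration; the constants are the standard UTF-16 ones.

```python
def encode_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point in U+10000..U+10FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not an astral (supplementary-plane) code point")
    offset = cp - 0x10000             # 20-bit value, 0x00000..0xFFFFF
    high = 0xD800 + (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return high, low


def decode_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into its code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = encode_surrogates(0x1F600)
    print(f"U+1F600 -> {high:04X} {low:04X}")
```

The top 4 bits of the 20-bit offset pick one of the 16 astral planes, and the remaining 16 bits pick the position inside that plane.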

There is no space for expansion without reassigning private use areas or changing the encoding mechanism of surrogates--which is currently completely specified (each surrogate pair will produce a valid code point).
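As a quick sanity check of the "completely specified" point, the following sketch counts the available pairs and shows that they land exactly on U+10000..U+10FFFF with nothing left over. The helper name pair_to_code_point is just for illustration.

```python
HIGH = range(0xD800, 0xDC00)  # 1024 high surrogates
LOW = range(0xDC00, 0xE000)   # 1024 low surrogates


def pair_to_code_point(high: int, low: int) -> int:
    """Standard UTF-16 mapping from a surrogate pair to a code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


total_pairs = len(HIGH) * len(LOW)                # 1024 * 1024 combinations
lowest = pair_to_code_point(HIGH[0], LOW[0])      # first pair
highest = pair_to_code_point(HIGH[-1], LOW[-1])   # last pair
planes = total_pairs // 0x10000                   # planes of 2^16 code points

if __name__ == "__main__":
    print(f"{total_pairs} pairs cover U+{lowest:X}..U+{highest:X} ({planes} planes)")
```

Since the mapping is one-to-one and the number of pairs equals the size of the target range, there are no spare pairs to assign to new planes.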