| HN Mirror



	by noselasd 4688 days ago
	UTF-8 encodes Unicode code points, so it's Unicode, or some external entity that converts between character sets, that has to deal with those issues, not UTF-8. UTF-8 would pretty much only need to be updated if the Unicode standard redefines what a code point is (e.g. starts using floating point, decimals, imaginary numbers or something else that is also unlikely to happen).

1 comments

derleth 4688 days ago

> UTF-8 would pretty much only need to be updated if the Unicode standard redefines what a code point is (e.g. starts using floating point, decimals, imaginary numbers or something else that is also unlikely to happen)

Or if they decide that they need more codepoints, so some invalid-but-possible UTF-8 byte sequences suddenly become valid.
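
To make "invalid-but-possible" concrete, here is a rough Python sketch. It assumes a strict decoder that follows the current rules (Python's built-in `utf-8` codec does); the helper name `decode_or_none` and the sample byte strings are made up for illustration. The second and third sequences follow the UTF-8 bit layout but land outside today's code point range, so they are rejected.

```python
def decode_or_none(raw: bytes):
    """Return the decoded string, or None if raw is not valid UTF-8 today."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

# Highest code point Unicode currently allows: U+10FFFF.
highest_valid = b"\xf4\x8f\xbf\xbf"
# Same 4-byte layout, but it would mean U+110000, one past the current limit.
just_past_range = b"\xf4\x90\x80\x80"
# A 5-byte form from the original UTF-8 design, no longer permitted.
five_byte_form = b"\xf8\x88\x80\x80\x80"

for sample in (highest_valid, just_past_range, five_byte_form):
    result = decode_or_none(sample)
    print(sample.hex(" "), "valid" if result is not None else "rejected")
```

If the code point range were ever extended, sequences like the last two are the ones a decoder would have to start accepting.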


masklinn 4688 days ago

There's no reason for that: UTF-8 is only there to encode Unicode code points, and the whole range of code points (including the 80% not yet assigned) can be expressed in UTF-8.
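
A quick way to check the "whole range" claim is to encode every code point and count the byte lengths. This is only a sketch using Python's built-in codec; the function name `utf8_length_counts` is hypothetical. It skips the surrogate range U+D800 to U+DFFF, which is reserved for UTF-16 and which strict UTF-8 encoders refuse.

```python
def utf8_length_counts():
    """Encode every non-surrogate code point and tally the UTF-8 byte lengths."""
    counts = {}
    for cp in range(0x110000):          # U+0000 through U+10FFFF
        if 0xD800 <= cp <= 0xDFFF:      # surrogates are not encodable in UTF-8
            continue
        encoded = chr(cp).encode("utf-8")
        # Round-trip check: decoding must give back the same code point.
        assert encoded.decode("utf-8") == chr(cp)
        counts[len(encoded)] = counts.get(len(encoded), 0) + 1
    return counts

if __name__ == "__main__":
    for length, count in sorted(utf8_length_counts().items()):
        print(f"{length} byte(s): {count} code points")
```

Every code point round-trips, assigned or not, and none needs more than four bytes.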
