| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by lmm 4593 days ago
	> A naïve program may accidentally split a UTF-16 surrogate pair, but that same program is just as liable to accidentally split a decomposed character sequence in UTF-8. You have to deal with those issues regardless of encoding. The point is that using UTF-8 makes these issues more obvious. Most programmers these days think to test with non-ascii characters. Fewer think to test with astral characters.