by ISV_Damocles
274 days ago
UTF-16 is just as complicated as UTF-8: it is also variable-width, needing surrogate pairs (two 16-bit code units, four bytes) for anything outside the Basic Multilingual Plane, so it doesn't avoid the issue you're complaining about for the most recently added scripts. It also has the added complexity of byte order: unless the endianness is declared out of band, you need a BOM to be sure you are reading each pair of bytes in the right order. That makes truncated data harder to recover than with UTF-8, where lead and continuation bytes are distinguishable, so a decoder can resynchronize at the next character boundary. UTF-32 would be a fair comparison, but it is 4 bytes per code point and I don't know what, if anything, uses it.
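To make that concrete, here is a rough Python sketch using only the standard codecs. The sample strings and variable names are my own picks for illustration, and the "truncation" is simulated by dropping the first byte of the stream, which is just one of several ways data can get cut.

```python
import codecs

ch = "\U0001F600"  # a single code point outside the Basic Multilingual Plane

# Both UTF-8 and UTF-16 need more than one code unit for this one character.
utf8 = ch.encode("utf-8")        # four 8-bit code units
utf16 = ch.encode("utf-16-be")   # two 16-bit code units (a surrogate pair)
utf32 = ch.encode("utf-32-be")   # one 32-bit code unit

# Byte order matters for UTF-16: the same two bytes read in the other
# order decode to a different character.
swapped = "A".encode("utf-16-le").decode("utf-16-be")

# Python's generic "utf-16" codec prepends a BOM in native byte order.
has_bom = "A".encode("utf-16").startswith(codecs.BOM_UTF16)

# Lose the first byte of the stream and decode what is left.
text = "日本語"
utf8_tail = text.encode("utf-8")[1:].decode("utf-8", errors="replace")
utf16_tail = text.encode("utf-16-be")[1:].decode("utf-16-be", errors="replace")

print(utf8.hex(), utf16.hex(), utf32.hex())
print(repr(swapped), has_bom)
print(repr(utf8_tail), repr(utf16_tail))
```

The UTF-8 tail loses only the damaged first character and still ends in the last two characters intact. The UTF-16 tail is shifted by one byte, so every remaining pair is read across a boundary and none of the original characters come back. If a whole 16-bit code unit had been lost instead of a single byte, UTF-16 would have stayed aligned.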