by Scorponok
4545 days ago
Depends what you mean by "better". UTF-8 generally uses fewer bytes than UTF-16 for the same string, unless the text is dominated by characters in the U+0800 to U+FFFF range (most CJK text, for example), which take three bytes in UTF-8 but only two in UTF-16. So it's a candidate, but it's not like you could just flip a switch and make all JavaScript use UTF-8.
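To make the size trade-off concrete, here is a rough sketch in Python (not JavaScript, just because it makes the byte counts easy to see). The helper name `encoded_sizes` and the sample strings are made up for illustration, and the BOM is deliberately left out of the UTF-16 count.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for text, without a BOM."""
    utf8_len = len(text.encode("utf-8"))
    # "utf-16-le" avoids the 2-byte BOM that plain "utf-16" would prepend.
    utf16_len = len(text.encode("utf-16-le"))
    return utf8_len, utf16_len


english = "hello world"  # ASCII only: 1 byte each in UTF-8, 2 in UTF-16
japanese = "\u3053\u3093\u306b\u3061\u306f"  # hiragana: 3 bytes each in UTF-8, 2 in UTF-16

print(encoded_sizes(english))   # (11, 22)
print(encoded_sizes(japanese))  # (15, 10)
```

Mixed content (markup, JSON keys, URLs around the actual text) tends to pull real-world documents back toward the ASCII case, but that depends on the data.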
|
UTF-8 is more robust in practice, because mishandling variable-length characters is more obvious. UTF-16 is variable-length too (anything above the BMP needs a surrogate pair), but it's easy to write something that works with the BMP and call it good enough. Even worse, you may not even know it fails above the BMP, because those characters are so rare you might never test with them. But in UTF-8, if you screw up multi-byte characters, any non-ASCII character will trigger the bug, and you will fix your code more quickly.
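A small Python sketch of that point, assuming the bug in question is "treat one code unit as one character". Both helper names are hypothetical; they just count code units the way buggy code would.

```python
def naive_utf16_length(text: str) -> int:
    """Count UTF-16 code units, wrongly assuming one unit per character."""
    return len(text.encode("utf-16-le")) // 2


def naive_utf8_length(text: str) -> int:
    """Count UTF-8 bytes, wrongly assuming one byte per character."""
    return len(text.encode("utf-8"))


samples = [
    "abc",             # ASCII only
    "caf\u00e9",       # one non-ASCII character inside the BMP
    "a\U0001F600b",    # one character above the BMP (an emoji)
]

for s in samples:
    # Columns: real character count, naive UTF-16 count, naive UTF-8 count
    print(len(s), naive_utf16_length(s), naive_utf8_length(s))
# 3 3 3
# 4 4 5
# 3 4 6
```

The UTF-16 version only goes wrong on the last sample, so a test suite without astral characters never notices. The UTF-8 version is already wrong on "café", which is the kind of input that shows up on day one.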
Also, UTF-8 does not suffer from endianness issues the way UTF-16 does. Few people use the BOM and no one likes it. And most importantly, UTF-8 is backward compatible with ASCII: any valid ASCII text is already valid UTF-8, byte for byte.
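Both of those are easy to check. A minimal Python sketch, with the sample string and variable names chosen only for illustration:

```python
text = "Hi"

# UTF-16 has two byte orders, so the same text yields different bytes.
utf16_le = text.encode("utf-16-le")  # b'H\x00i\x00'
utf16_be = text.encode("utf-16-be")  # b'\x00H\x00i'

# Decoding with the wrong byte order does not fail here; it silently
# produces different (valid but unrelated) characters.
misread = utf16_le.decode("utf-16-be")

# UTF-8 is defined as a byte sequence, so there is no byte order to get
# wrong, and ASCII text encodes to identical bytes in ASCII and UTF-8.
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

print(utf16_le == utf16_be)       # False
print(misread == text)            # False
print(ascii_bytes == utf8_bytes)  # True
```

Without a BOM or some out-of-band agreement, the reader of a UTF-16 stream has to guess which of the two byte orders was used.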