by ender7
4659 days ago
This behavior is actually part of the ECMAScript standard [0], so it's unlikely that V8 (or any other conformant JS engine) will ever behave the way you (and many others) would want.

JS's treatment of strings is even wackier than you might think: it is neither really UCS-2 nor UTF-16. Engines are semi-required to use a UTF-16 representation of strings internally, but the API surface exposed to JS code makes them look like UCS-2 strings (i.e. every 16-bit code unit counts as its own character, with no awareness of surrogate pairs). However, if you stick a JS string into something that is UTF-16 aware, such as a DOM node, the surrogate pairs will display correctly. See [1] for a very clear explanation of this muddy subject.

[0] http://www.ecma-international.org/ecma-262/5.1/#sec-8.4
[1] http://mathiasbynens.be/notes/javascript-encoding
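To make the "looks like UCS-2" part concrete, here is a minimal sketch using only ES5-era string methods. The character U+1F4A9 is just an arbitrary example from outside the Basic Multilingual Plane, and `countCodePoints` is a hypothetical helper written for illustration, not a built-in.

```javascript
// U+1F4A9 is outside the BMP, so UTF-16 encodes it as a surrogate pair.
var s = "\uD83D\uDCA9";

// The string API works on 16-bit code units, not on characters.
var codeUnitCount = s.length;     // 2, even though this is one character
var highUnit = s.charCodeAt(0);   // 0xD83D, the high surrogate
var lowUnit = s.charCodeAt(1);    // 0xDCA9, the low surrogate
var lonelyHalf = s.charAt(0);     // half of a pair, not a usable character

// Hypothetical helper (an assumption, not part of the standard library):
// count code points by treating a valid high+low surrogate pair as one.
function countCodePoints(str) {
  var count = 0;
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    if (isHigh && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low surrogate, it belongs to this code point
      }
    }
    count++;
  }
  return count;
}
```

With this, `s.length` is 2 while `countCodePoints(s)` is 1, which is the mismatch people usually trip over. An unpaired surrogate is counted as one code point here; that is a choice made for the sketch, not something the spec dictates.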