by othermaciej 4948 days ago
Safari uses UTF-16, not UCS-2. I believe this is true of other browsers as well. Otherwise this would render the replacement char, but it doesn't; it renders correctly:

javascript:var x = '𝌆';document.write(x);
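For anyone wondering what is actually stored for that character, here is a rough sketch of the surrogate-pair arithmetic UTF-16 uses for code points above U+FFFF. The helper name `toSurrogatePair` is made up for illustration; it is not a browser or language API.

```javascript
// Hypothetical helper (not from the thread or any spec): split a code point
// above U+FFFF into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;      // 20 bits remain after the offset
  const high = 0xD800 + (offset >> 10);    // top 10 bits go in the high surrogate
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits go in the low surrogate
  return [high, low];
}

// U+1D306 is the character used in the snippet above.
const [high, low] = toSurrogatePair(0x1D306);
const fromUnits = String.fromCharCode(high, low);

console.log(high.toString(16), low.toString(16)); // d834 df06
console.log(fromUnits.length);                    // 2
```

A strictly UCS-2 consumer would have no way to interpret those two units as one character, which is why correct rendering is evidence of UTF-16 handling.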


Well, a JS string is just a series of UTF-16 code units (per ES5, there is no implementation choice here), so there isn't really any encoding per se (and it isn't necessarily a UTF-16 string, per the spec's definition thereof, as lone surrogates are valid). The fact that this works is more a testament to the DOM being UTF-16 than to JS.
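A quick way to see the "series of code units" point for yourself, as a minimal sketch using only `length` and `charCodeAt` (both in ES5). The variable names are mine, chosen for illustration.

```javascript
// One astral character (U+1D306), written as its two UTF-16 code units.
const astral = '\uD834\uDF06';
console.log(astral.length);                      // 2 (code units, not characters)
console.log(astral.charCodeAt(0).toString(16));  // d834
console.log(astral.charCodeAt(1).toString(16));  // df06

// A lone high surrogate is still a perfectly valid JS string value,
// even though it is not well-formed UTF-16 text.
const lone = '\uD834';
console.log(lone.length);                        // 1
console.log(lone.charCodeAt(0).toString(16));    // d834
```

The language never complains about the lone surrogate; whether anything sensible gets displayed is up to whatever consumes the string.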

(On the other hand, I'm sure you knew that. But probably there are people reading your comment who didn't. :))

You are technically correct, the best kind of correct! But I think we both agree there is absolutely no sense in which anything in browser engines is UCS-2, and that browsers will not in fact replace characters beyond the BMP with the replacement glyph, as the top-level comment claimed. It is kind of embarrassing that the top-rated comment (as of writing) says completely false things.