|
|
|
|
|
by cmccabe
4948 days ago
|
|
UCS2 allows random access to characters, UTF-16 does not. I'm not sure if that's really true. On IBM's site, they define 3 levels of UCS-2, only one of which excludes "combining characters" (really code points). http://pic.dhe.ibm.com/infocenter/aix/v6r1/index.jsp?topic=%... If you have combining characters, then you can't simply take the number of bytes and divide by 2 to get the number of letters. If you don't have combining characters, then you have something which isn't terribly useful except for European languages (I think?) Maybe someone more familiar with the implementation can describe which path they actually went down for this... given what I've heard so far, I'm not optimistic. |
|
See https://github.com/v8/v8/blob/3ff861bbbb62a6c0078e042d8077b2... and https://github.com/v8/v8/blob/3ff861bbbb62a6c0078e042d8077b2....