by Ded7xSEoPKYNsDd 3082 days ago
> Indexing code points in both UTF-8 and UTF-16 requires reading the whole string up to index location. Substrings are the same as well.

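Java's String methods don't index by Unicode code points, though. A Java string is a sequence of 16-bit UTF-16 code units, and charAt, length, and substring all count in those units, so the core API behaves as if the string were UCS-2. Code points are only reachable through separate methods such as codePointAt and offsetByCodePoints.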
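
To make the difference concrete, here is a minimal sketch using only standard java.lang.String and Character methods. The class name CodeUnitDemo, its helper methods, and the sample string are my own, made up for illustration. The sample contains one character outside the BMP, so it occupies two code units.

```java
public class CodeUnitDemo {
    // Hypothetical sample: 'a', U+1F600 (stored as a surrogate pair), 'b'.
    static final String SAMPLE = "a\uD83D\uDE00b";

    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnitLength(String s) {
        return s.length();
    }

    // Number of Unicode code points. This has to walk the string.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // The n-th code point (0-based), located by scanning from the start.
    static int codePointAtIndex(String s, int n) {
        int unitOffset = s.offsetByCodePoints(0, n);
        return s.codePointAt(unitOffset);
    }

    public static void main(String[] args) {
        System.out.println(codeUnitLength(SAMPLE));   // 4
        System.out.println(codePointLength(SAMPLE));  // 3
        // charAt(1) returns half of the pair, not a whole character.
        System.out.println(Character.isHighSurrogate(SAMPLE.charAt(1))); // true
        System.out.println(Integer.toHexString(codePointAtIndex(SAMPLE, 1))); // 1f600
    }
}
```

Note that the third code point, 'b', sits at charAt(3), not charAt(2), and SAMPLE.substring(0, 2) would end on an unpaired high surrogate.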

2 comments

That can sure lead to interesting bugs!
Right. Same in C#, the C++ STL, and Apple’s Objective-C/Swift.
Swift takes a different approach than Objective-C[0].

[0] https://www.mikeash.com/pyblog/friday-qa-2015-11-06-why-is-s...