by kevincox
1544 days ago
> Native UTF-8 in memory makes character indexing a non-constant time operation

The only reason Java's UTF-16 has constant-time indexing is that it uses a braindead definition of "character": a UTF-16 code unit. If you want constant-time indexing by code point you need to go UTF-32, but the downsides are too great for most users. So in practice everyone uses UTF-8 because it is usually the most memory efficient. Plus it turns out that character indexing isn't actually that common an operation, so UTF-8 really is the right move for almost every application.
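To make the code unit point concrete, here is a rough sketch in Python rather than Java. It encodes a string to UTF-16 to mimic what a Java `String` stores; the helper `utf16_unit_at` is a name made up for this illustration, not a real API. Treat it as a demonstration under those assumptions, not as how any runtime implements strings.

```python
# A three-"character" string: 'a', U+1F600 (an emoji outside the BMP), 'b'.
s = "a\U0001F600b"

# Code points: the unit UTF-32 can index in constant time.
code_points = len(s)  # 3

# UTF-16 code units: the unit a Java String indexes in constant time.
# The emoji needs a surrogate pair, so it counts twice.
utf16_units = len(s.encode("utf-16-le")) // 2  # 4

# Storage cost of each encoding for the same text.
utf8_bytes = len(s.encode("utf-8"))       # 1 + 4 + 1 = 6
utf32_bytes = len(s.encode("utf-32-le"))  # 3 * 4 = 12


def utf16_unit_at(text: str, i: int) -> int:
    """Hypothetical helper: return the i-th UTF-16 code unit of text.

    This mirrors indexing by code unit, not by code point.
    """
    data = text.encode("utf-16-le")
    return int.from_bytes(data[2 * i:2 * i + 2], "little")


# Index 1 is not the emoji, only the first half of its surrogate pair.
half = utf16_unit_at(s, 1)
is_high_surrogate = 0xD800 <= half <= 0xDBFF  # True
```

So the constant-time index hands back half a character for anything outside the BMP, and the UTF-32 version that indexes whole code points costs twice the bytes of UTF-8 for this string.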