by shellac
1544 days ago
Yes, that won't change anything internal. However:
> ...JVM's internal string representation is UTF-16
Hasn't been true for a while. Since JDK 9 (compact strings, JEP 254), String uses a byte array internally for storage, plus an encoding flag. Currently that's either UTF-16 or Latin 1, unless compact strings are disabled, in which case it's all UTF-16.

Latin 1 has the special property that each of its fixed-width code units maps onto a single UTF-16 code unit. It is for that reason alone that CharSequence implementors can use it as an alternative to UTF-16. Imagine trying to implement `char charAt(int index)` if you're backed by a UTF-8 byte array (or UTF-32, for that matter)!
From a programmer's perspective, Java is pretty much as UTF-16 as ever.
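
To make the `charAt` point concrete, here is a rough sketch in Java. It is not how the JDK implements String; `Latin1CharAt` and `charAtLatin1` are hypothetical names made up for illustration. It only shows that a Latin 1 byte array can be indexed directly to get a UTF-16 code unit, while the same index into a UTF-8 byte array lands in the middle of a multi-byte sequence, and that `length()` still counts UTF-16 code units.

```java
import java.nio.charset.StandardCharsets;

class Latin1CharAt {
    // Hypothetical helper, not a JDK API: every Latin 1 byte is exactly one
    // UTF-16 code unit with the same numeric value, so indexing is O(1).
    static char charAtLatin1(byte[] latin1, int index) {
        return (char) (latin1[index] & 0xFF);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // "café", all characters fit in Latin 1

        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        // Latin 1: one byte per char, so byte index == char index.
        System.out.println(latin1.length);                           // 4
        System.out.println(charAtLatin1(latin1, 3) == s.charAt(3));  // true

        // UTF-8: 'é' takes two bytes, so byte index 3 is only the lead byte
        // of a two-byte sequence, not a character on its own.
        System.out.println(utf8.length);                             // 5
        System.out.println(Integer.toHexString(utf8[3] & 0xFF));     // c3

        // The public API is still UTF-16: a code point outside the BMP
        // counts as two chars (a surrogate pair).
        String emoji = "\uD83D\uDE00"; // U+1F600
        System.out.println(emoji.length());                           // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));  // 1
    }
}
```

A UTF-8-backed `charAt` would have to scan from the start of the array (or keep an index on the side) to find the n-th code unit, which is why it isn't a drop-in alternative.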