by rcoveson 1592 days ago

You're talking about implementation details of java.lang.String. The interface it exposes is still UTF-16.

Latin-1 has the special property that each of its fixed-width code units maps onto a single UTF-16 code unit. It is for that reason alone that CharSequence implementors can use it as an alternative to UTF-16. Imagine trying to implement `char charAt(int index)` if you're backed by a UTF-8 byte array, where finding the index-th code unit means scanning from the start (or UTF-32, for that matter, where a single element can expand into a surrogate pair)!
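To make that concrete, here is a rough sketch of a Latin-1-backed CharSequence. `Latin1Sequence` is a made-up name for illustration, not a JDK class, and this is not how java.lang.String is actually written internally; it only shows why `charAt` stays a constant-time array lookup when every byte corresponds to exactly one UTF-16 code unit.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class for illustration only; not part of the JDK.
final class Latin1Sequence implements CharSequence {
    private final byte[] bytes; // one Latin-1 byte per UTF-16 code unit

    Latin1Sequence(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    @Override
    public int length() {
        // Byte count and UTF-16 code unit count are always equal.
        return bytes.length;
    }

    @Override
    public char charAt(int index) {
        // Zero-extend the byte: Latin-1 0x00..0xFF maps to U+0000..U+00FF.
        return (char) (bytes[index] & 0xFF);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        Objects.checkFromToIndex(start, end, bytes.length);
        return new Latin1Sequence(Arrays.copyOfRange(bytes, start, end));
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

With a UTF-8 array the same `charAt` can't index directly, since "caf\u00e9" is 4 chars but 5 UTF-8 bytes, so the byte offset of a given char depends on everything before it.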

From a programmer's perspective, Java is pretty much as UTF-16 as ever.