by ygra 4591 days ago
UTF-16 is just as bad as UTF-8 regarding variable-width code points: anything outside the BMP takes two 16-bit code units (a surrogate pair). The only thing you always have (unless using compression schemes like SCSU) is random access to code units. Only UTF-32 gives you random access to code points as well. However, that's still of questionable value, because when dealing with text you often want to handle graphemes, not code points, code units or bytes.
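
To make the difference concrete, here is a rough Python sketch that counts code points and code units for a few strings. The helper name `code_unit_counts` and the sample strings are made up for illustration. It relies on Python 3's `len(str)` counting code points, and it does not count graphemes at all, since the standard library has no grapheme segmentation.

```python
def code_unit_counts(text: str) -> dict:
    """Count code points and code units per encoding (hypothetical helper)."""
    return {
        "code_points": len(text),                            # Python 3 str is indexed by code point
        "utf8_units": len(text.encode("utf-8")),             # 8-bit code units
        "utf16_units": len(text.encode("utf-16-le")) // 2,   # 16-bit code units, no BOM
        "utf32_units": len(text.encode("utf-32-le")) // 4,   # 32-bit code units, no BOM
    }

# Illustrative samples, not taken from the comment above.
samples = {
    "ascii": "a",               # U+0061
    "latin1": "\u00e9",         # U+00E9, precomposed e with acute
    "astral": "\U0001F600",     # U+1F600, outside the BMP
    "combining": "e\u0301",     # e + combining acute: two code points, one grapheme
}

for name, s in samples.items():
    print(name, code_unit_counts(s))
```

Only the UTF-32 count always matches the code point count. The `combining` sample shows why even that is of limited use: it is two code points in every encoding, yet a reader sees a single character.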