by tasogare 2051 days ago
UTF-8 has won on the web and in file storage. For programming (C#, probably Java too, the Windows API) it’s UTF-16. The good point is that it’s a Unicode encoding; the bad point is that it’s variable-width, yet it looks fixed-width for common characters.
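To make the "looks fixed-size" trap concrete, here is a minimal Java sketch. The class name `Utf16Width` and its two helper methods are made up for illustration; only `String.length()` and `String.codePointCount` are real JDK calls. It compares the number of UTF-16 code units with the number of code points for one character inside the Basic Multilingual Plane and one outside it.

```java
public class Utf16Width {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u00e9";          // U+00E9, inside the BMP: one code unit
        String astral = "\uD83D\uDE00"; // U+1F600, outside the BMP: a surrogate pair

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));       // prints "1 1"
        System.out.println(codeUnits(astral) + " " + codePoints(astral)); // prints "2 1"
    }
}
```

Code that indexes by `char` works until the first character outside the BMP shows up, which is why the encoding can pass for fixed-size on common text.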
1 comment

> For programming (C#, probably Java too, the Windows API) it’s UTF-16.

For programming it's UTF-8, except for Windows-centric developers (and Microsoft technologies).

Java uses WTF-16 internally but rarely assumes an external charset, and when it does, older APIs tend to use the default charset (which is generally ASCII-compatible at best: on Western Windows it's commonly windows-1252, though it's been UTF-8 for years on most Unices). Newer APIs like Files.newBufferedReader(Path) go straight to UTF-8.
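A rough sketch of the difference, assuming Java 8 or later. The class name `CharsetDefaults` and the helper `roundTrip` are hypothetical; the JDK calls are real. It writes UTF-8 bytes to a temp file and reads them back with Files.newBufferedReader(Path), which decodes as UTF-8 whatever the platform default is, and it prints that default for comparison.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharsetDefaults {
    // Writes text as UTF-8 bytes to a temp file, then reads the first line back
    // with Files.newBufferedReader(Path), which always decodes as UTF-8.
    static String roundTrip(String text) throws IOException {
        Path tmp = Files.createTempFile("charset-demo", ".txt");
        try {
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            try (BufferedReader reader = Files.newBufferedReader(tmp)) {
                return reader.readLine();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // The platform default is what older APIs fall back to when no charset is given.
        System.out.println("default charset: " + Charset.defaultCharset());

        String text = "na\u00efve caf\u00e9";
        System.out.println(roundTrip(text).equals(text)); // prints "true"
    }
}
```

On a machine whose default charset is windows-1252, reading the same file through an API that uses the default charset would garble the non-ASCII characters, while the round trip above still succeeds.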