by ninkendo
1637 days ago
One pedantic qualification: any byte except 0x2f (`/`) or 0x00. This actually rules out nearly any non-UTF-8 character set (besides ASCII). Quote from Linus, which reminds me of Henry Ford's "you can have any color you want, so long as it's black":
> And that one true format is UTF-8. End of story. If you try to talk to the kernel in UCS-2 or anything else, you _will_ fail.
https://lore.kernel.org/all/Pine.LNX.4.58.0402141827200.1402...
|
It doesn't--pretty much any character set that has seen widespread use in the past few decades would be compatible. Any single-byte charset that is ASCII-compatible (such as most Windows CP* sets or the entire ISO-8859-* suite) would work. Most Asiatic charsets that use variable-width encodings (e.g., EUC-JP, Shift-JIS, Big5, GBK) follow the rule that a lone byte in the 0x00-0x7f range is ASCII and the subsequent bytes of a multi-byte character fall in the 0x40-0xff range. That means 0x2f and 0x00 never show up inside a multi-byte character, so those charsets are compatible as well.
So actually the list of notable incompatible charsets is easier to write out: UTF-16, UTF-32, EBCDIC, and ISO-2022-* charsets (which are mode-switching).
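If anyone wants to poke at this themselves, here is a rough Python sketch of the check being argued about. It is not how the kernel does anything (the kernel just compares bytes); it only asks whether a charset keeps `/` and NUL at their ASCII byte values and avoids producing 0x2f or 0x00 for any other character. The function name `is_path_safe` and the sample filenames are made up for illustration, and a sample-based check like this can show a charset is incompatible but cannot prove it is compatible for every character.

```python
# Bytes the kernel treats specially in a path component.
FORBIDDEN = {0x2F, 0x00}


def is_path_safe(encoding: str, sample: str) -> bool:
    """Sample-based check: does `encoding` look usable for filenames?

    Hypothetical helper for illustration only. Raises UnicodeEncodeError
    if `sample` cannot be represented in `encoding`.
    """
    # The separator and the terminator must keep their ASCII byte values.
    if "/".encode(encoding) != b"/" or "\0".encode(encoding) != b"\0":
        return False
    # No other character may produce 0x2f or 0x00 anywhere in its encoding.
    for ch in sample:
        if ch in "/\0":
            continue
        if FORBIDDEN & set(ch.encode(encoding)):
            return False
    return True


# Made-up sample names, chosen so each one is encodable in its charset.
SAMPLES = {
    "utf-8": "くも.txt",
    "latin-1": "café.txt",
    "euc_jp": "くも.txt",
    "shift_jis": "くも.txt",
    "gbk": "中文.txt",
    "big5": "中文.txt",
    "utf-16-le": "notes.txt",
    "utf-32-le": "notes.txt",
    "cp037": "notes.txt",      # an EBCDIC code page
    "iso2022_jp": "くも.txt",  # mode-switching, 7-bit
}

for enc, name in SAMPLES.items():
    verdict = "ok" if is_path_safe(enc, name) else "incompatible"
    print(f"{enc:12} {verdict}")
```

UTF-16 and UTF-32 fail because every ASCII character drags zero bytes along with it, EBCDIC fails because `/` is not 0x2f there, and ISO-2022-JP fails because after the escape sequence it reuses 7-bit bytes (including 0x2f) for Japanese characters.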