|
|
|
|
|
by jcranmer
1637 days ago
|
|
> This actually rules out nearly any non-UTF8 character set (besides ASCII.) It doesn't--pretty much any character set that has seen widespread use in the past few decades would be compatible. Any single-byte charsets that are ASCII compatible (such as most Windows CP* sets or the entire ISO-8859-* suite) would work. Most Asiatic charsets (e.g., EUC-JP, Shift-JIS, Big5, GBK) that use variable-width encodings follow the rule that characters in the 0x00-0x7f range are ASCII and subsequent characters in the 0x40-0xff range, and so are themselves compatible as well. So actually the list of notable incompatible charsets is easier to write out: UTF-16, UTF-32, EBCDIC, and ISO-2022-* charsets (which are mode-switching). |
|
Asiatic character sets are an interesting point though. I wonder how common they were at the time of what Linus wrote…