by jcranmer
2387 days ago
There is no character in EUC-JP or Shift-JIS that is not in Unicode. The explicit goal of Unicode in its original formulation was to be able to losslessly round-trip any other charset through Unicode, and the initial version of Unicode incorporated the source kanji lists for the EUC-JP/Shift-JIS charsets in their entirety.
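A rough way to see the round-trip property for yourself, as a minimal sketch using Python's standard `euc_jp` and `shift_jis` codecs. The sample string and the helper name `round_trips` are my own illustrative choices, and a single string obviously does not prove the claim for the whole charset:

```python
# Sketch: check that legacy-encoded Japanese bytes survive a trip through Unicode.
# The sample text and helper name are illustrative assumptions.

SAMPLE = "日本語のテキスト"  # kanji, hiragana, and katakana


def round_trips(text: str, codec: str) -> bool:
    """Return True if legacy bytes -> Unicode -> legacy bytes is lossless."""
    legacy_bytes = text.encode(codec)       # bytes as a legacy page would store them
    as_unicode = legacy_bytes.decode(codec)  # convert to Unicode code points
    return as_unicode == text and as_unicode.encode(codec) == legacy_bytes


for codec in ("euc_jp", "shift_jis"):
    print(codec, round_trips(SAMPLE, codec))
```

The same check could be run over every code point a codec accepts if you wanted something closer to an exhaustive test.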
|
The parent comment seemed to imply that we should drop support for non-UTF-8 charsets.
To me, that sounds like saying a website served with 'charset=EUC-JP' (such as http://www.os2.jp/) should be broken: browsers should either error out or display a large quantity of black boxes because the page uses a non-UTF-8 encoding.
I'm claiming the only reason the author thinks that's really viable is that in our western-centric world we mostly see ASCII and UTF-8, both of which still look fine if you switch to UTF-8 only.
CJK websites that use their regional equivalent of ASCII (legacy encodings such as EUC-JP), on the other hand, would have to be manually upgraded to display correctly if browsers dropped that support.
Sure, all their characters can be represented in UTF-8, but there are large swathes of websites that will never be updated to a new charset, and only a western-centric view can so blithely suggest breaking them all.
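To make the asymmetry concrete, here is a small sketch of what a UTF-8-only decoder would do with the two kinds of pages. The byte strings stand in for page bodies and the helper name `decodes_as_utf8` is hypothetical; a real browser does more than this (sniffing, headers, meta tags), so treat it as an illustration only:

```python
# Sketch: ASCII bytes are already valid UTF-8, EUC-JP bytes generally are not.
# Page contents and the helper name are illustrative assumptions.


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are valid under strict UTF-8 decoding."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ascii_page = "Hello, world".encode("ascii")
euc_jp_page = "日本語".encode("euc_jp")

print(decodes_as_utf8(ascii_page))    # the ASCII page is unaffected
print(decodes_as_utf8(euc_jp_page))   # the EUC-JP page fails strict decoding

# A lenient decoder substitutes U+FFFD replacement characters instead of failing,
# which is the "black boxes" outcome.
garbled = euc_jp_page.decode("utf-8", errors="replace")
print(garbled)
```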