by ken 2291 days ago
Again, that's great and I understand that (I've studied Japanese), but that's only part of the new version. They're not adding pictures of "mousetrap" and "olives" and "toilet plunger" because any existing language needs to write these.

Furthermore, I'm really starting to question the way CJK is encoded. We don't make every English word a separate codepoint. 97% of these CJK ideographs are just different combinations of the same few radicals. Korean seems especially weird, as they have both individual radicals and every precomposed triple (in a block that's been rearranged once or twice, on the basis that nobody was really using it yet). I'm not saying we should nix all precomposed Hanzi/Kanji, exactly, because that's a very convenient way for programs to handle text, but it seems like this system is becoming increasingly awkward for non-western languages.
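For what it's worth, the Korean case is easy to poke at with Python's standard `unicodedata` module. This is only a rough sketch: the sample syllable is arbitrary, and `compose_syllable` is a hypothetical helper name of mine, though the arithmetic in it follows the Hangul composition formula from the Unicode Standard. It shows that a precomposed syllable and its sequence of individual jamo are two encodings of the same text, and that the precomposed block is laid out so a code point can be computed from its parts.

```python
import unicodedata

# U+D55C is one precomposed Hangul syllable (HANGUL SYLLABLE HAN).
syllable = "\uD55C"

# NFD splits the precomposed syllable into its conjoining jamo.
jamo = unicodedata.normalize("NFD", syllable)
jamo_code_points = [hex(ord(ch)) for ch in jamo]

# NFC recombines the jamo into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", jamo)


def compose_syllable(lead: int, vowel: int, tail: int = 0) -> str:
    """Compute a precomposed syllable from jamo indices.

    Hypothetical helper for illustration. The indices are positions in
    the 19 leading consonants, 21 vowels, and 28 trailing slots
    (slot 0 means no trailing consonant).
    """
    return chr(0xAC00 + (lead * 21 + vowel) * 28 + tail)


# Size of the precomposed block: every lead/vowel/tail combination.
block_size = 19 * 21 * 28

print(jamo_code_points)
print(recomposed == syllable)
print(compose_syllable(18, 0, 4) == syllable)
print(block_size)
```

So the "every precomposed triple" block is algorithmic rather than a hand-maintained list, which is roughly the opposite of how Han ideographs are assigned.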

I feel there's a fundamental flaw when our "universal" text encoding system can't handle the regular creation of new words in a well-understood way, for languages spoken by 1/3rd of the world's population. It's like we're issuing hardware patches for a software problem.

1 comment

It is. I do not like the way CJK is being dealt with. Not to mention that fonts don't include all the CJK variants of a glyph, which matters when I use the same word but need the J or K variant because that is how it is supposed to be written.

Even the "C" has traditional and simplified variants.

Fortunately, I think Unicode is pretty much done for alphabetic languages. If Unicode's CJK design someday turns out not to be good enough, breaking it off into something better isn't entirely impossible.