by sterlind 1547 days ago
It's one thing Unicode does well. I haven't come across a living natural language that's not represented in Unicode, and some extinct languages are too: there's hieroglyphics (classic and demotic), Sumerian, etc. Mayan and Aztec logosyllabaries don't seem to be officially allocated yet, but there's a proposed range for them.
2 comments

We even have Unicode codepoints for alphabets like Shavian or Deseret which were never in widespread use but currently exist as linguistic curiosities.
I suppose that's the point: it's not just Unicode being quirky. Imagine trying to publish a paper on language X and its unique, dead script. It'd be harder now in the 21st century without it being in Unicode than it would've been in the early 20th (without Unicode existing at all). What would you do? Append an image and refer to the characters numerically?
Probably, preferably as a vector file, and then reference it with LaTeX (or whatever you're typesetting your paper with) so it shows up as part of the rendered document. I.e., the same way you include any other non-Unicode item in your paper.
There are ~130 known unencoded scripts, about 70 historical and 60 modern (some of which are extinct, most of which just don't have that many users). Most of these have proposed ranges but aren't actually in Unicode yet. See https://linguistics.berkeley.edu/sei/index.html (the Script Encoding Initiative), which is the main group working to get the rest finished.
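For the scripts upthread that are encoded, like Shavian and Deseret, you can poke at them from Python's standard `unicodedata` module. This is only a rough sketch: `script_prefix` is a made-up helper name, taking the first word of the character name is a crude stand-in for a real script lookup, and what counts as "named" depends on the Unicode version bundled with your Python (see `unicodedata.unidata_version`).

```python
import unicodedata

def script_prefix(code_point):
    """Return the first word of the character's Unicode name, or None if it has no name.

    Hypothetical helper: the first word of the name is only a rough hint at the script.
    """
    name = unicodedata.name(chr(code_point), None)
    return name.split()[0] if name else None

# First code point of the Shavian block and of the Deseret block.
for cp in (0x10450, 0x10400):
    label = unicodedata.name(chr(cp), "<no name in this Unicode version>")
    print(f"U+{cp:04X} {label}")
```

A code point in a range that is only proposed has no name yet, so `script_prefix` returns `None` for it until that script is actually encoded and your Python catches up.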