by Joker_vD 291 days ago
> Emojis aren't text

Neither are digits, or control characters, strictly speaking. We really shouldn't have been able to have CR and LF explicitly embedded in the text files.

2 comments

Neither are high and low surrogates - those are big ranges of code points that are illegal except in one specific (and not recommended) encoding (UTF-16). Yet there they will remain in Unicode.
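For anyone who wants to see the surrogate point concretely, here is a rough Python sketch. The helper names (`is_surrogate`, `utf16_units`, `can_encode_utf8`) are made up for illustration; only the standard library is used. It shows that a code point outside the BMP becomes a surrogate pair in UTF-16, while a lone surrogate is rejected by the UTF-8 codec.

```python
import unicodedata

# The surrogate block spans U+D800..U+DFFF (2048 code points, category "Cs").
SURROGATE_FIRST, SURROGATE_LAST = 0xD800, 0xDFFF


def is_surrogate(cp):
    """True if the integer code point falls in the surrogate block."""
    return SURROGATE_FIRST <= cp <= SURROGATE_LAST


def utf16_units(ch):
    """Return the 16-bit code units of a string encoded as big-endian UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def can_encode_utf8(ch):
    """True if Python's strict UTF-8 codec accepts the string."""
    try:
        ch.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# U+1F600 GRINNING FACE lies outside the BMP, so UTF-16 needs a surrogate pair.
units = utf16_units("\U0001F600")
print([hex(u) for u in units])               # ['0xd83d', '0xde00']
print(all(is_surrogate(u) for u in units))   # True

# A lone surrogate is a code point but not encodable as UTF-8.
print(unicodedata.category("\ud800"))        # Cs
print(can_encode_utf8("\ud800"))             # False
```

The pair only means something inside UTF-16; on its own, either half is a code point that no well-formed UTF-8 or UTF-32 text can contain.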

Digits definitely are a form of text though. Unicode is for writing systems, which definitely includes writing numbers.

CR & LF are in there for backwards compatibility with ASCII. Similarly, the first emoji were included in Unicode for compatibility with some encoding systems used for SMS on Japanese mobile carriers. I wish the Unicode folks had drawn a hard line that they weren't going to add any more. If people wanted dingbats, they could go use a dingbats font.