by exitcode00 2717 days ago
These are solved problems, though, and Unicode identifiers are rare in practice...

See Normalization Form KC and Clause 21 of ISO/IEC 10646:2017.

"Normalization forms are the mechanisms allowing the selection of a unique coded representation among alternative, but equivalent coded text representations of the same text. Normalization forms for use with ISO/IEC 10646 are specified in the Unicode Standard UAX#15..." yada yada

1 comment

Unicode normalization doesn't actually solve a single problem I mentioned. All of the listed characters are equal to themselves in both NFC and NFKC.
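
To make the point concrete, here is a rough Python sketch using the standard `unicodedata` module. The characters in it are my own stand-ins (cross-script look-alikes of Latin letters), not necessarily the ones listed earlier in the thread, and `LOOKALIKES` and `is_stable` are names made up for this example.

```python
import unicodedata

# Hypothetical sample set: cross-script look-alikes of Latin letters.
# These are illustrative stand-ins, not the list from the earlier comment.
LOOKALIKES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
}


def is_stable(text: str) -> bool:
    """Return True if text is left unchanged by both NFC and NFKC."""
    return all(
        unicodedata.normalize(form, text) == text for form in ("NFC", "NFKC")
    )


for latin, other in LOOKALIKES.items():
    folded = unicodedata.normalize("NFKC", other)
    # The look-alike normalizes to itself, so it never becomes the Latin letter.
    print(latin, unicodedata.name(other), is_stable(other), folded == latin)

# For contrast, a compatibility character that NFKC does rewrite:
# LATIN SMALL LIGATURE FI becomes the two letters "f" and "i".
ligature_folded = unicodedata.normalize("NFKC", "\ufb01")
print(ligature_folded)
```

Normalization only merges different encodings of the same abstract character (plus compatibility variants under NFKC). It does nothing for distinct characters that merely look alike.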

Also Unicode identifiers aren't rare in terms of language support. Most of the popular languages support them--C/C++, C#, Java, PHP, Python, Perl, Swift, Go, Rust, Ruby, JavaScript, even Ada. It's actually difficult to find a popular language that prohibits Unicode identifiers entirely (MATLAB does, not sure about Visual Basic).
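
As a quick check for one of the languages on that list, the sketch below asks Python which strings it accepts as identifiers and then binds a variable with a non-ASCII name. The sample names are arbitrary picks of mine, and `candidates`, `results` and `namespace` are just names for this example.

```python
# Arbitrary sample names chosen for illustration.
candidates = ["größe", "переменная", "变量", "2x"]

# str.isidentifier() reports whether Python's grammar accepts the string
# as an identifier (it does not check for reserved keywords).
results = {name: name.isidentifier() for name in candidates}
for name, ok in results.items():
    print(repr(name), ok)

# A non-ASCII name is an ordinary variable: run a tiny program that
# assigns to one and read the value back from its namespace.
namespace = {}
exec("переменная = 42", namespace)
print(namespace["переменная"])
```

Only `2x` is rejected, and that is because it starts with a digit, not because of anything to do with Unicode.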