| > Pascal and old basic (among other languages) have been case insensitive for decades ... because back then Unicode was still a pipe dream in the mind of some visionary. Everything used 8 bit encoding, and everyone assumed ASCII or at least something compatible with the 7 bit subset of ASCII. Nowadays files are formatted in UTF-8, and most modern languages actually fully support UTF-8 identifiers. Nim itself supports UTF-8 "letters" in identifiers, and what is "upper case" or "lower case" 100% depends from the current locale. Restricting your case normalization logic to ASCII is __really bad__, because it basically means that non-Latin letters in identifiers won't be normalized, with possibly unexpected consequences. > APL is the only real language that doesn’t impose a western character set. - Rust uses UTF-8: https://doc.rust-lang.org/reference/identifiers.html - Go allows any Unicode letter in identifiers: https://go.dev/ref/spec#Identifiers - Swift is also famous for allowing you to use emojis in identifiers. - Python supports non-ASCII identifiers: https://www.python.org/dev/peps/pep-3131/ And the list goes on. Even C++ can optionally support Unicode in identifiers (for instance, Clang and GCC do indeed support things like `constexpr auto 黒 { "lol" };`). |
They could still have been case sensitive (C was), so I don't understand how that's relevant to the idea that "case insensitivity is a problem".
> Rust, Go, Swift, Python
All of these languages impose ASCII for their keywords and directives. They allow you to use other characters for identifiers, but impose ascii in everything that has pre-defined semantics. Original APL is the only "real"/practical language that I'm aware of that gave up the "western centric view" of the world to the point that it doesn't have a single English keyword. (Brianfuck, etc. exist as well, but ....)
And they all impose a left-to-right reading order, which is just as western-centric. Arabic/Farsi/Hebrew go right-to-left, and there are languages that can also go top-to-bottom.
I think the outrage about "western centrism" is misguided. This is a formal system, and just like math, it reflects some history by using latin letters and left-to-right for the predefined symbols, and even preferred use of latin characters in identifiers.
> Nim itself supports UTF-8 "letters" in identifiers, and what is "upper case" or "lower case" 100% depends from the current locale.
If that's true, that may be a problem. I'll look into that, thanks for pointing out - from memory, Nim only folds the lower 7-bit by a 32 difference in ascii code, so it is well defined regardless of locale, but I'll check.
The whole idea of utf-8 in identifiers is a minefield, whether you fold case or not; e.g:
"Εхаmрⅼе" and "Example" have no single letter in common (I chose them that way using[0]) and no language that allows utf-8 identifiers is going to warn you about that.
[0] https://www.irongeek.com/-attack-generator.php