by MauranKilom 2238 days ago
> Extended characters in identifiers may now be specified directly in the input encoding (UTF-8, by default), in addition to the UCN syntax (\uNNNN or \UNNNNNNNN) that is already supported:

    static const int π = 3;
    int get_naïve_pi() {
      return π;
    }
Lovely!
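A small sketch of the two spellings mentioned in the quote, assuming a compiler that accepts UTF-8 in identifiers as the note describes. The accessor names are made up for illustration. As far as I understand, the UCN form and the direct UTF-8 form name the same identifier, so a declaration written one way can be used the other way:

```cpp
// Declared with the UCN spelling of U+03C0 (GREEK SMALL LETTER PI).
static const int \u03C0 = 3;

// Hypothetical accessors, one per spelling; both refer to the variable above.
int pi_via_utf8() { return π; }
int pi_via_ucn() { return \u03C0; }
```

Both functions should return 3, since `π` and `\u03C0` are the same name to the compiler.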
3 comments

The next obfuscated code competition is sure gonna be interesting.
Yup, zero-width space is going to do a number on everyone.
For what it's worth, being able to enforce conventions like

    present?(p) // return bool
    get!(g) // throw if not found
à la Ruby/Elixir could be good.
I don't know if you are being ironic or not, but the non-English-speaking world will love it.
As a non-English programmer who has seen a lot of code not written in English: please, god, NO. The mixture of English and other-language identifiers in external libraries makes me cry.
Yep, writing non-English keywords in C++ is like having Prolog code inside C++ :-)

(But as a non-English person, I find it very helpful to be able to have Unicode in strings.)

I wish math used normal English, not Greek letters. English is my second language too.
They are latin letters, not english. Unless you mean futhorc. Which would be awesome.
He didn't say "english letters", he said "english" which is a language. The implication is that we should use conventions (including potentially whole words) that are familiar to the English-literate world and not Greek glyphs.
>The implication is that we should use conventions (including potentially whole words) that are familiar to the English-literate world and not Greek glyphs.

The "english-literate" world (emphasis on literate) is, or historically has been, very familiar with Greek glyphs.

And that's just for humanities.

The mathematics- and physics-literate world, doubly so. Everybody uses symbols like pi, theta, and sigma (e.g. in the summation formula).

https://en.m.wikipedia.org/wiki/Greek_alphabet

Many of these characters are hard to type on regular keyboards, which is probably what the parent commenter was referring to.

From a non-english speaking country: I prefer US-ASCII for code. Anything else just obfuscates.
There are lots of other languages that support full UTF-8 in identifiers (e.g., Go) and the non-English-speaking world doesn't take advantage.
Every such language does it insecurely. The only exceptions are Java, Rust and cperl.

I'd rather have no Unicode identifiers than insecure ones that don't follow the Unicode security guidelines for identifiers.
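To give a rough idea of the kind of check those guidelines are about, here is a crude sketch of mixed-script detection. It is an assumption-laden approximation, not the real algorithm: it classifies code points by a few hard-coded block ranges instead of the actual Unicode script data, and the names `Script`, `script_of` and `is_mixed_script` are invented for this example.

```cpp
#include <string>

// Hypothetical, very coarse script buckets; real checks use Unicode script data.
enum class Script { Other, Latin, Greek, Cyrillic };

// Classify a code point by block range (an approximation, not the Script property).
Script script_of(char32_t cp) {
    if ((cp >= U'A' && cp <= U'Z') || (cp >= U'a' && cp <= U'z')) return Script::Latin;
    // Latin-1 letters through Latin Extended-B, skipping the multiply/divide signs.
    if (cp >= 0x00C0 && cp <= 0x024F && cp != 0x00D7 && cp != 0x00F7) return Script::Latin;
    if (cp >= 0x0370 && cp <= 0x03FF) return Script::Greek;
    if (cp >= 0x0400 && cp <= 0x04FF) return Script::Cyrillic;
    return Script::Other;  // digits, underscore, and everything not covered above
}

// Returns true if the identifier contains code points from more than one bucket.
bool is_mixed_script(const std::u32string& identifier) {
    Script seen = Script::Other;
    for (char32_t cp : identifier) {
        Script s = script_of(cp);
        if (s == Script::Other) continue;       // ignore unclassified code points
        if (seen == Script::Other) seen = s;    // first classified code point
        else if (s != seen) return true;        // a second script showed up
    }
    return false;
}
```

With this sketch, `U"p\u0430ypal"` (Cyrillic а in a Latin word) is flagged while `U"get_na\u00EFve_pi"` is not. It would also flag something like `U"x\u03C0"`, which shows why a simple rule like this is too blunt for code that mixes Latin names with Greek math symbols.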

I've done a similar thing when I was writing a language (with generics) that compiled to Go and needed to implement name mangling. But in any case, this isn't an example of a non-native speaker using UTF-8 to write in his native language. :)
Tbh, I mainly shared the example code because I was entertained by the contrast of sophisticated naming and crude approximation. Still, it illustrates that this can come in very handy if used properly.

Of course you can also cause all sorts of mayhem. Even disregarding fun such as GREEK QUESTION MARK looking like a semicolon and zero-width spaces, I probably would not use this feature in enterprise code. It is too likely that some tool somewhere (e.g. an alternate compiler?) is KO'd by characters outside the basic ASCII range (which has served us well and will keep doing so).
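If you did want to guard against the two troublemakers named above, a pre-commit style scan is one option. The following is only a sketch: `Finding`, `decode_utf8`, `is_suspicious` and `find_suspicious` are hypothetical names, the decoder does not reject overlong encodings, and the list of flagged code points is just U+037E and U+200B from this comment, not a complete confusables list.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: byte offset of a flagged code point in the source text.
struct Finding { std::size_t offset; char32_t code_point; };

// Decode one UTF-8 sequence starting at s[i] and advance i past it.
// Malformed or truncated input yields U+FFFD and advances by one byte.
char32_t decode_utf8(const std::string& s, std::size_t& i) {
    unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len;
    char32_t cp;
    if (b0 < 0x80) { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else { ++i; return 0xFFFD; }
    if (i + len > s.size()) { ++i; return 0xFFFD; }
    for (std::size_t k = 1; k < len; ++k) {
        unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) { ++i; return 0xFFFD; }
        cp = (cp << 6) | (b & 0x3F);
    }
    i += len;
    return cp;
}

// Illustrative list only: the two characters mentioned in the comment above.
bool is_suspicious(char32_t cp) {
    return cp == 0x037E    // GREEK QUESTION MARK, renders like ';'
        || cp == 0x200B;   // ZERO WIDTH SPACE, renders as nothing
}

// Scan UTF-8 source text and collect every flagged code point with its offset.
std::vector<Finding> find_suspicious(const std::string& source) {
    std::vector<Finding> out;
    std::size_t i = 0;
    while (i < source.size()) {
        std::size_t start = i;
        char32_t cp = decode_utf8(source, i);
        if (is_suspicious(cp)) out.push_back({start, cp});
    }
    return out;
}
```

Feeding it a line that ends in a GREEK QUESTION MARK instead of a semicolon reports the byte offset of that character, which a build script could turn into an error before any compiler gets confused.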