by wpollock 79 days ago
Allowing Unicode characters, then stating that best practice is to stick with ASCII, is weird. (Go is not alone in this practice.) Unicode identifiers have a host of issues: some characters have no case distinction, some have title-case but not uppercase, some "capitalize" the last letter in a word and not the first (Hebrew has five "final form" letters), etc. Does Go specify the meaning (exported or not) if a letter has no case, or if an identifier starts with a zero-width joiner character? Without a huge list of detailed rules, too much is left to the implementation to decide. I prefer to stick with ASCII for names.

Fun fact: When printing with movable type began, printers would travel with large "type cases" containing the small wood or metal blocks with glyphs on them. The ones they used frequently were kept in the lower half of the case, in easy reach. That's where the terms "lowercase" and "uppercase" come from.

2 comments

It’s bad to disallow non-ASCII characters when programming for non-English business domains. The domain vocabulary often doesn’t translate well to English, or you need a glossary for someone familiar with the business domain to know which English term is supposed to mean which native business term. It’s just a pain. Conversely, transliterating the native term in ASCII can introduce ambiguities or be awkward or just plain weird for the native speaker.

Of course, Unicode can be abused, but ASCII isn’t completely free of that either. (Maybe it tickles your fancy to name a variable _o0O0o_ or l1lIl1lIl1lIl1l.) If your native script doesn’t have upper and lower case, compromises may have to be made. We have to trust programmers to use good judgement.

Anecdotally, I'm programming for non-English business domains in Go and Python and I've literally never seen anyone use a native alphabet in identifiers - it's always either poor translations or transliterations.
I believe https://imgur.com/dN9Nz3h is the canonical example of why full Unicode support is maybe not desirable.
because this is so much better:

    class pppp {
        func ppp(_ c: Int, m: Int) -> Int {
            return c + m
        }
    }
    
    var b = 3
    var s = b + 2
    
    var p = pppp()
    print(p.ppp(b, m: s))
it is weird, especially for Go with its semantic naming and famously opinionated compiler. it will gladly build code with a variable named 𖤐界ᥱᥲΣ੭, but God forbid it's unused.