|
|
|
|
|
by qsort
1890 days ago
|
|
> As somebody living in Europe, I think it's a perspective you can have only if you live mostly in an English speaking world. I live in Europe and I (mostly) agree that (most) code shouldn't (usually) contain any codepoint greater than 127.
It's a simple matter of reducing the surface area of possible problems. Code needs to work across machines and platforms, and it's basically guaranteed that someone, somewhere is going to screw up the encoding. I know it shouldn't happen, but it will happen, and ASCII mostly works around that problem.
Another issue is readability. I know ASCII by heart, but I can't memorize Unicode. If identifiers were allowed to contain random characters, it would make my job harder for basically no reason.
Furthermore, the entire ASCII charset (or at least the subset of ASCII that's relevant for source code) can easily be typed from a keyboard. Almost everyone I know uses the US layout for programming because European ones frankly just suck, and that means typing accents, diacritics, emoji or what have you is harder, for no real discernible reason. String handling is a different issue, and I agree that native functions for a modern language should be able to handle utf8 without crashing. The above only applies to the literal content of source file. |
|
Otherwise, chinese coders can't put their name in comments? Or do they need to invent an ascii names for themself, like we forced them to do in the past for immigration ?
And no hard coded value for any string in scripts then? I mean, names, cities, user messages and label, all in in config files even for the smallest quick and dirty script because you can't type "Joyeux Noël" in a string in your file? Can't hardcode my "Téléchargements" folder when in doing some batch work in my shell?
Do I have write all my monetary symbols with escape sequences? Having a table of constant filled with EURO_SIGN=b'\xe2\x82\xac' instead of using "€"? Same for emoji?
I disagree, and I'm certainly glad that not only Python3 allows this, but do it in a way that is actually stable and safe to do. I rarely have encoding issues nowadays, and when I do, it's always because something outside is doing something fishy.
Utf8 in my code file, yes please.