|
|
|
|
|
by speedplane
1890 days ago
|
|
> It is often best to avoid non-ASCII characters in source code. ...
>> Depends of the language. In Python 3, files are expected to be utf8 by default, and you can change that by adding a "# coding: <charset>" header. It's interesting that many languages avoid unicode and non-ASCII text, yet they make assumptions about file and directory structures about the underlying system. It's as if interpreting directory and file system structures is "okay", but interpreting file formats is not. > In fact, it's one of the reasons it was a breaking release in the first place, and being able to put non-ASCII characters in strings and comments in my source code are a huge plus. Sorry, but as a Python dev that went from 2 to 3, yes native unicode features are nice, but no, it was not worth breaking two decades of existing code. |
|
Up to 2015, my laptop was plagued with Python software (and others) crashing because they didn't handle unicode properly. Reading from my "Vidéos" directory, entering my first name, dealing with the "février" month in date parsing...
For the USA, things just work most of the time. For us, it's a nightmare. It's even worse because most software is produced by English speaking people that have your attitude, and are completely oblivious about the crashing bugs they introduce on a daily basis.
In Asia it's even more terrible.
And I've heard the people saying you can perfectly code unicode aware software in Python 2. Yes, you can, just like you can code memory safe code in C.
In fact, just teaching Python 2 is painful in Europe. The student write their name in a comment ? Crashes if you forget the encoding header. Using raw_input() with unicode to ask a question? Crashes. Reading bytes from an image file ? Get back a string object. Got garbage bytes in string object ? Silently concatenate with a valid string and produce garbage.