by mbrubeck
4089 days ago
In this example the dichotomy is between String (which is guaranteed by the type system to be valid UTF-8) and OsStr (which might be in an unknown encoding or otherwise not decodable to valid Unicode). This is exactly when you want a systems language to require explicit conversions, rather than converting things silently and possibly losing or corrupting data.
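To make the explicit conversions concrete, here is a minimal sketch using only the standard library. The helper names (`to_str_strict`, `to_string_lossy`, `to_os_string`) are hypothetical wrappers made up for illustration; the underlying calls are `OsStr::to_str`, `OsStr::to_string_lossy`, and `OsString::from`.

```rust
use std::ffi::{OsStr, OsString};

/// Strict conversion (hypothetical helper): returns None when the OS string
/// is not valid Unicode, so the caller has to decide what to do about it.
fn to_str_strict(raw: &OsStr) -> Option<&str> {
    raw.to_str()
}

/// Lossy conversion (hypothetical helper): invalid sequences are replaced
/// with U+FFFD, so the caller opts in to possible data loss by name.
fn to_string_lossy(raw: &OsStr) -> String {
    raw.to_string_lossy().into_owned()
}

/// The other direction cannot fail: any valid UTF-8 string can be stored
/// in an OsString.
fn to_os_string(s: &str) -> OsString {
    OsString::from(s)
}
```

Neither direction from OsStr to str happens implicitly: you either handle the `None` case or ask for the lossy version by name.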
Exactly. Python 3 went down the "silently converting" route, and it's not pretty [1]. I would go so far as to call it harmful.
[1] http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
I understand the difficulty in this space; much of it comes from forcing the Windows Unicode filesystem API onto Python as its world-view, rather than sticking to the traditional Unix bytes world-view. I'm unixy, so I'm completely biased, but I think adopting the Windows approach is fundamentally broken.