|
|
|
|
|
by BuckRogers
3403 days ago
|
|
Well, do your research and come to your own conclusions. Most people are going to agree that UTF8 is the way to go. You can advocate something else, since you seem to take affront to my opposition to Python3's Microsoft-oriented implementation. If you know anything about Swift, it was designed with a primary goal to being a smooth transition from and interop with ObjC so like other legacy implementations (such as CPython3), it had sacrifices that limited how forward-looking it could be. |
|
I want my strings to be strings - that is, an object for which it is guaranteed that enumerating codepoints always succeeds (and, consequently, any operation that requires valid codepoints, like e.g. case-insensitive comparison, also succeeds). This is not the case for "just assume a bytearray is UTF-8 if used in string context", which is why I consider this model broken on the abstraction level. It's kinda similar to languages that don't distinguish between strings and numbers, and interpret any numeric-looking string as number in the appropriate context.
FWIW, the string implementation in Python 3 supports UTF-8 representation. And it could probably be changed to use that as the primary canonical representation, only generating other ones if and when they're requested.