by banthar
3085 days ago
PEP-393 is a stupid compromise. They couldn't choose between UCS-2 and UCS-4, so they are using both. They are wasting tons of CPU cycles converting between them, and a single character outside the narrower range doubles the size of the string. I don't fully understand the use case for extracting code points from strings, but they could have just added a Java-like codePoints method and kept returning code units from the old methods. That would be CPU- and memory-efficient and 100% backwards compatible. I think the problem is that the same could have been done in Python 2 (with UTF-8), which would mean fewer reasons for Python 3.
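For anyone who wants to check the size effect, here is a rough sketch. It assumes CPython 3.3 or later, where `sys.getsizeof` reports the PEP-393 storage; other implementations may behave differently. The helper names `bytes_per_char` and `utf16_code_units` are made up for illustration. It also shows the code point vs. code unit difference that a Java-like codePoints method would expose.

```python
import sys


def bytes_per_char(ch: str, filler: str = "a") -> int:
    """Estimate how many bytes each character costs in a string containing `ch`.

    Hypothetical helper. Relies on CPython's PEP-393 layout: the widest
    character in a string determines the width of every character.
    """
    shorter = ch + filler * 100
    longer = ch + filler * 200
    # Subtracting the two sizes cancels the fixed object header,
    # leaving only the cost of the 100 extra filler characters.
    return (sys.getsizeof(longer) - sys.getsizeof(shorter)) // 100


def utf16_code_units(s: str) -> int:
    """Count UTF-16 code units, i.e. what Java's String.length() would report."""
    return len(s.encode("utf-16-le")) // 2


if __name__ == "__main__":
    print(bytes_per_char("a"))           # ASCII only: 1 on CPython 3.3+
    print(bytes_per_char("\u20ac"))      # one BMP character: 2
    print(bytes_per_char("\U0001F600"))  # one astral character: 4

    emoji = "\U0001F600"
    print(len(emoji))                    # code points: 1
    print(utf16_code_units(emoji))       # UTF-16 code units: 2
```

So one character outside the BMP moves the whole string from 2 to 4 bytes per character, which is the doubling mentioned above.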