by hsivonen
3822 days ago
I see that it's irksome, but as someone who works on the Web Platform, which takes backward compat seriously, I tend to view Python 3 as a mistake. I'm still hoping that they make a Python 5 that's compatible with Python 2.7 programs but otherwise brings in new features. I'm not holding my breath, though. The saddest thing about Python 3 is that they made a breaking change to do Unicode "right" and still did it wrong. The right way to do Unicode is the way Rust does it: UTF-8 in memory and no (safe) API to introduce UTF-8 invalidity. UTF-32 is wrong, because it's wasteful and still doesn't accomplish what people naively expect due to grapheme clusters potentially taking more than one UTF-32 code unit.
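
To make the grapheme-cluster point concrete, here is a minimal Python 3 sketch. It is an illustration only, not part of the original comment, and the sample string is an arbitrary choice. It shows one user-perceived character that spans two code points, so a fixed-width encoding such as UTF-32 still does not give one unit per "character", while costing more bytes than UTF-8.

```python
import unicodedata

# "é" written as a base letter plus a combining acute accent:
# one grapheme cluster, but two Unicode code points.
s = "e\u0301"

code_points = len(s)                      # Python 3 str length counts code points
utf8_bytes = len(s.encode("utf-8"))       # 'e' takes 1 byte, U+0301 takes 2 bytes
utf32_bytes = len(s.encode("utf-32-le"))  # 4 bytes per code point, no BOM

print(code_points, utf8_bytes, utf32_bytes)

# NFC normalization happens to compose this particular pair into the single
# code point U+00E9, but many clusters have no precomposed form.
composed = unicodedata.normalize("NFC", s)
print(len(composed))
```

Counting grapheme clusters themselves needs segmentation rules that the standard library does not provide, which is why no choice of code-unit width removes the surprise.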
> UTF-32 is wrong, [because] it's wasteful and still doesn't accomplish what people naively expect due to grapheme clusters potentially taking more than one UTF-32 code unit.
Out of curiosity, is there a correct way to encode Unicode that doesn't involve this level of surprise? I thought that this was still an unsolved problem at this point.