| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by BuckRogers 3403 days ago

Well, I'm not an expert but in effort for full-disclosure Guido and the CPython core dev team aren't either. They hold a myriad of excuses for their decisions and they're all highly suspect from even a casual observer that doesn't just drink the koolaid. In the end, they'll just tell you they maintain CPython so only their opinion matters. Fair, but they're still wrong. Python3 is controversial for good reasons.[0] It's not lazy folks or whatever ad hominem is out there today. I couldn't tell your intentions given the lack of information included with your post.

Go handles strings by having strings be like Python3's byte-strings and unicode-strings as one type. This enables code to be written that generally doesn't force you to very often think about encodings, which you shouldn't have to as UTF8 is the one true encoding.[1] Or litter your code with encode/decode, or receive exceptions from strings (see Zed's post on some of that) where there wasn't previously. Python3 solved the unicode emojibake mixing unicode and bytes problems that some developers created for themselves in Python2, but did so by forcing the burden to every single Python3 developer, and breaking everyone's code while simultaneously refusing to engineer an interpreter that could run both CPython2/3 bytecode. Which is possible, the Hotspot JVM and .NET CLR prove it. Shifting additional burdens to the developer in situations where it's necessary, makes sense. It wasn't here because of both Python's general abstraction level, and Go showed it can be solved elegantly. Strings are just bytes and they're assumed to be UTF8 encoding. Everyone wins. Only Windows-specific implementations like the original .Net CLR shouldn't be UTF-8 by default, internal and external representation. Only a diehard Windows-centric person would disagree or someone with a legacy implementation (Java, C#, Python3 etc). The CPython3 maintainers fully admit they're leaning towards the Windows way of handling strings.

As you know, handling text/bytes is fairly critical and fundamental. For Python3 to get this wrong with such a late stage breaking change with no effort to make up for it with a unified bytecode VM is unfortunate. Add in the feature soup and the whole thing is a mess.

[0]http://learning-python.com/books/python-changes-2014-plus.ht...

[1]http://utf8everywhere.org/

2 comments

collinmanderson 3403 days ago

Also, it's worth noting that creators of go _invented_ utf8.

link

grzm 3403 days ago

Minor nit: s/emojibake/mojibake/

link