by nine_k 2335 days ago
The problems of the Python 2 to 3 migration were mostly not about clever tricks. They were largely about making Unicode strings and byte strings incompatible (as they should have been from the get-go). Much Python 2 code mixed them up, and that was a source of actual bugs. Hence the need to fix things manually.
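To make the incompatibility concrete, here is a minimal Python 3 sketch. The helper names `mixing_fails` and `join_strict` are made up for illustration, and UTF-8 as the default encoding is an assumption; real code has to know which encoding its bytes are actually in.

```python
def mixing_fails() -> bool:
    """Return True if concatenating str and bytes raises TypeError."""
    try:
        "abc" + b"def"  # Python 2 allowed this; Python 3 refuses
    except TypeError:
        return True
    return False


def join_strict(text: str, raw: bytes, encoding: str = "utf-8") -> str:
    """Combine text and bytes by decoding the bytes explicitly first."""
    # The encoding is an assumption here; the caller must supply the right one.
    return text + raw.decode(encoding)


if __name__ == "__main__":
    print(mixing_fails())
    print(join_strict("caf", "é".encode("utf-8")))
```

The manual part of a migration is mostly deciding, at each such boundary, whether a value is text or bytes, and putting the explicit `decode` or `encode` call there.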
1 comment

It's hard to overstate how big a breaking change that was. Python is basically scriptable C. In C, strings aren't really a thing, but they are first class in Python (since everything is effectively a dict). And in C, char arrays are your typical string-like data structure. So in py2, it made (some) sense at the time to let the str and byte[] types intermingle. I don't agree with that choice, but it wasn't unreasonable (much like the null pointer).

This led to all manner of fast-and-loose use of str as byte[]. I've seen inline asm and even machine code in Python.

Now it's the new millennium and oh look, an ASCII char won't cut it as your language's implementation of strings.
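A quick Python 3 sketch of why an ASCII char doesn't cut it. The helper name `ascii_encodable` and the sample word are just for illustration:

```python
def ascii_encodable(text: str) -> bool:
    """Return True if every character in text fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


word = "naïve"
utf8 = word.encode("utf-8")  # "ï" takes two bytes in UTF-8

if __name__ == "__main__":
    print(ascii_encodable("plain"), ascii_encodable(word))
    print(len(word), len(utf8))
```

The character count and the byte count differ as soon as the text leaves ASCII, which is exactly where treating str as a byte buffer stops working.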