


	by ubernostrum 3208 days ago
	This would be the case in Python 2, where source code files are assumed to be ASCII-encoded unless there's an encoding comment at the top of the file. In Python 3, source code files are assumed to be UTF-8.
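A rough way to see the Python 3 side of this is the standard library's `tokenize.detect_encoding`, which reports the encoding the tokenizer would pick for a source file. This is only a sketch: the helper name `source_encoding` and the sample byte strings are made up for illustration.

```python
import io
import tokenize


def source_encoding(source: bytes) -> str:
    """Return the encoding Python 3 would use to decode these source bytes."""
    # detect_encoding looks for a BOM or a PEP 263 coding comment in the
    # first two lines and falls back to UTF-8 when it finds neither.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


# Hypothetical sample sources: one with no declaration, one with a coding comment.
plain = b"x = 1\n"
declared = b"# -*- coding: latin-1 -*-\nx = 1\n"

print(source_encoding(plain))     # default, no declaration
print(source_encoding(declared))  # declaration overrides the default

# Non-ASCII bytes with no declaration compile fine because they are read as UTF-8.
namespace = {}
exec(compile('s = "\u00e9"\n'.encode("utf-8"), "<example>", "exec"), namespace)
print(namespace["s"] == "\u00e9")
```

Under Python 2 the undeclared non-ASCII source would be rejected instead, which is the behavior described above.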


ninkendo 3207 days ago

Interesting that Python 2 couldn't fix that in a hotfix/point release... UTF-8 is backwards compatible with ASCII, so it shouldn't break anything if source started being interpreted as UTF-8. I'd be curious to see what their reasoning was.


dguaraglia 3207 days ago

I would imagine Python's approach to introducing new language features had a lot to do with it. Having to go through the PEP system takes some time, and changes like these tend to be reserved for minor-version releases. All in all, I love the PEP system. It's such an open concept, and I've been surprised by the number of quality proposals that get implemented. Wish Go had something like it.


ubernostrum 3207 days ago

The change to UTF-8 source encoding also changed the legal set of characters for identifiers, and specified how to normalize them. That, in turn, is the reason behind this thing I posted on Twitter a while back:

https://gist.github.com/ubernostrum/b7b705bf21b86a1b5c1e2c9f...

It's also a big enough change that it couldn't really have happened in Python 2.
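For anyone who hasn't run into the normalization part: Python 3 normalizes identifiers to NFKC while parsing (PEP 3131), so two names that look different can end up being the same variable. A small sketch, with the character choice being my own illustration and not necessarily what the gist shows:

```python
import unicodedata

# U+1D431 MATHEMATICAL BOLD SMALL X, written as an escape to avoid font issues.
fancy = "\U0001d431"

# NFKC folds the compatibility character down to a plain ASCII "x".
plain = unicodedata.normalize("NFKC", fancy)
print(plain == "x")

# Assigning to the fancy name actually binds the normalized name.
namespace = {}
exec(fancy + " = 42", namespace)
print(namespace["x"])
print(fancy in namespace)
```

The assignment lands under `x`, not under the bold character, because the name is normalized before it is stored.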


dguaraglia 3207 days ago

Correct, this was a codebase that still had some Pylons (gasp! Not even Pyramid, but legit Pylons) code.
