by masklinn 3275 days ago
> You're right that you can get Python 3 to behave correctly if you jump through some hoops. However, wasn't the point of Python 3 to remove the hoopjumping in the first place?

The hoops of encoding and decoding bytes are not optional: manipulating text and manipulating arbitrary bytes are not the same thing, and encoding and decoding is how you translate between the two domains. Python 2 hid this for a subset of the bytes, and code was usually broken as a result. Python 3 requires that this split be taken into account in all cases (as do e.g. Java or C#) and is significantly better for it.
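
To make the split concrete, here's a minimal Python 3 sketch. The sample string and the variable names are mine, purely for illustration; the point is only that `str` and `bytes` are distinct types and that you cross between them explicitly:

```python
# Text is a sequence of code points; bytes are a sequence of integers 0-255.
text = "na\u00efve caf\u00e9"      # "naïve café", 10 code points

# Encoding: text -> bytes, using a named encoding.
data = text.encode("utf-8")
n_chars = len(text)                # 10
n_bytes = len(data)                # 12: the two accented letters take 2 bytes each

# Decoding: bytes -> text, with the same encoding.
roundtrip = data.decode("utf-8")

# Python 3 refuses to mix the two domains implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

In Python 2 the equivalent mixing would silently go through an implicit ASCII conversion and only blow up once non-ASCII data showed up.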

> Anglophone programmers think that internationalisation means "upgrading" from ASCII to Unicode.

Which is a pretty significant upgrade from their previous case of literally not giving a fuck.

> True internationalisation means mechanisms to deal with the world's text encodings in a neutral and culture-agnostic way.

That's completely meaningless word salad.

1 comments

And just to clarify: it's completely meaningless word salad because

1. encoding and decoding has relatively little to do with internationalisation

2. encoding and decoding is no more culture-specific than the source encoding is: if you're dealing with culture-specific encodings, then short of not doing anything with the content (not even displaying it), you can't be culture-agnostic until after you've decoded the text

3. you can't "neutrally" deal with "the world's text encodings" (whatever that's supposed to mean in your mind) because most of them are incompatible with one another: they use the same binary space for completely different text mappings
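
To illustrate point 3, a small Python 3 sketch; the byte value and the three encodings are just ones I picked as examples. The same single byte is a different character in each legacy encoding, and is not valid UTF-8 at all:

```python
# One byte, no encoding information attached.
raw = bytes([0xE9])

# The same byte decoded under three single-byte legacy encodings.
decoded = {enc: raw.decode(enc) for enc in ("latin-1", "cp1251", "koi8_r")}
# latin-1 -> U+00E9 (Latin small e with acute)
# cp1251  -> U+0439 (Cyrillic small short i)
# koi8_r  -> U+0418 (Cyrillic capital i)

# As UTF-8, a lone 0xE9 is a truncated multi-byte sequence and is rejected.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

Without knowing which encoding produced the bytes, there is no "neutral" reading of them; you have to pick one and decode.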