by criddell 3823 days ago
> This way people will actually switch.

Is that important?

From a software engineering perspective, no. From a software ecosystem architecture perspective, yes.
print as a statement requires a special syntax rule. When people switch, that special case goes away. Maybe that does not account for much on the whole, but language designers tend to be irked by such things.
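To make the special case concrete, here is a minimal Python 3 sketch. Python 2 needed dedicated grammar for forms such as `print >>sys.stderr, "x"`, whereas in Python 3 `print` is an ordinary callable that takes keyword arguments. The helper `emit_all` and its `writer` parameter are hypothetical names invented for this illustration.

```python
import io


def emit_all(items, writer=print, **kwargs):
    """Call `writer` once per item; `writer` defaults to the built-in print.

    `emit_all` is a hypothetical helper, not part of any library.
    """
    for item in items:
        writer(item, **kwargs)


# Because print is a plain function, it can be a default argument and can
# receive `file=` and `end=` like any other keyword parameters.
buffer = io.StringIO()
emit_all(["a", "b"], file=buffer, end=";")
captured = buffer.getvalue()
print(repr(captured))  # prints: 'a;b;'
```

None of this needs a grammar rule of its own, which is the point being made about the statement form.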
I see that it's irksome, but as someone who works on the Web Platform, which takes backward compat seriously, I tend to view Python 3 as a mistake. I'm still hoping that they make a Python 5 that's compatible with Python 2.7 programs but otherwise brings in new features. I'm not holding my breath, though.

The saddest thing about Python 3 is that they made a breaking change to do Unicode "right" and still did it wrong. The right way to do Unicode is the way Rust does it: UTF-8 in memory and no (safe) API to introduce UTF-8 invalidity.

UTF-32 is wrong, because it's wasteful and still doesn't accomplish what people naively expect due to grapheme clusters potentially taking more than one UTF-32 code unit.
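A small Python 3 illustration of the grapheme cluster point, using only built-in string methods. The variable names are made up for the example.

```python
# "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
# character built from two code points.
accented = "e\u0301"

code_points = len(accented)                       # Python counts code points
utf32_bytes = len(accented.encode("utf-32-le"))   # 4 bytes per code point, no BOM
utf8_bytes = len(accented.encode("utf-8"))        # 1 byte for "e", 2 for U+0301

print(code_points, utf32_bytes, utf8_bytes)  # prints: 2 8 3
```

So even with fixed-width code units, "one character" on screen is not one array slot, and the fixed width costs more bytes than UTF-8 here.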

CPython (since 3.3) stores a string with one byte per code point (Latin-1) by default and only widens to 2 or 4 bytes per code point when the characters in the string require it.

> UTF-32 is wrong, [because] it's wasteful and still doesn't accomplish what people naively expect due to grapheme clusters potentially taking more than one UTF-32 code unit.

Out of curiosity, is there a correct way to encode Unicode that doesn't involve this level of surprise? I thought that this was still an unsolved problem at this point.

You get people to accept the truth that characters have a variable length in bytes.

Then you offer a data structure that lets you perform O(1) or O(log n) operations on sequences of single-character strings.

If it's read-only you could make it just be an index, blah blah the details don't matter a lot, the point is you can make something that's both correct to grapheme clusters and probably more space-efficient than UTF-32 despite the extra data.

And then the encoding inside the character strings isn't particularly important, but might as well use UTF-8.
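A rough sketch of the read-only index idea, in Python. Loud assumption: the segmentation here is deliberately simplified (a cluster is a code point plus any following code points with a nonzero canonical combining class), which is not the full UAX #29 grapheme cluster algorithm and will split emoji sequences, Hangul jamo, and so on. The class name `ClusterIndexedText` is hypothetical.

```python
import unicodedata


class ClusterIndexedText:
    """Read-only UTF-8 text with O(1) access to the i-th cluster.

    Hypothetical sketch: clusters are approximated as a base code point
    followed by combining marks, NOT the full UAX #29 rules.
    """

    def __init__(self, text):
        self._data = text.encode("utf-8")
        self._starts = []  # byte offset where each cluster begins
        offset = 0
        for ch in text:
            # A non-combining code point (or the very first one) opens a cluster.
            if unicodedata.combining(ch) == 0 or not self._starts:
                self._starts.append(offset)
            offset += len(ch.encode("utf-8"))
        self._starts.append(offset)  # sentinel: end of the last cluster

    def __len__(self):
        return len(self._starts) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Two list lookups and one slice: constant-time lookup by cluster index.
        return self._data[self._starts[i]:self._starts[i + 1]].decode("utf-8")


text = ClusterIndexedText("ne\u0301e")  # "n", "e" + combining acute, "e"
print(len(text))  # prints: 3
```

The index costs one integer per cluster on top of the UTF-8 bytes. A real implementation would presumably pack the offsets more tightly and use a proper segmenter.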

-

Either that or make yourself a hilariously inefficient format based on:

UAX15-D3. Stream-Safe Text Format: A Unicode string is said to be in Stream-Safe Text Format if it would not contain any sequences of non-starters longer than 30 characters in length when normalized to NFKD.

Who's with me on 128-byte characters.
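For the curious, the quoted definition can be checked directly with the standard `unicodedata` module. This is a minimal sketch that treats a non-starter as any code point with a nonzero canonical combining class; the function names are made up for the example.

```python
import unicodedata

MAX_NON_STARTERS = 30  # limit from the quoted UAX15-D3 definition


def longest_non_starter_run(text):
    """Length of the longest run of non-starters after NFKD normalization."""
    longest = current = 0
    for ch in unicodedata.normalize("NFKD", text):
        if unicodedata.combining(ch) != 0:  # nonzero combining class
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_stream_safe(text):
    """True if no run of non-starters exceeds the limit (hypothetical helper)."""
    return longest_non_starter_run(text) <= MAX_NON_STARTERS


print(is_stream_safe("e" + "\u0301" * 30))  # prints: True
print(is_stream_safe("e" + "\u0301" * 31))  # prints: False
```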

People should start calling the hypothesized next version Python 2.great.
As a user, I don't care how irksome the language designer finds these things.
Well, they have to balance (dis)pleasing both the old users and the new users. Not everyone will be happy, and even those who may benefit later on have no way of voicing an opinion based on satisfaction that is yet to come.

On the human side, people do not like to stare at glaring mistakes all the time, if they can do something about it -- having weighed the cost associated with the change.

In 20 years, if Python is still a thing, it likely will not be because it was designed right from the get-go. It will be because it changed enough to stay relevant despite the fact that some changes may have left incompatibilities in their wake. Users have to live with that.

You do if it causes language development to stagnate.
It makes the language trickier to learn.
Err, what? Maybe you mean that an arbitrary distinction between statements and expressions makes the language trickier to learn than it otherwise might be. But once you have that distinction, print being a function vs a statement isn't trickier one way or the other.
Print fits much better with Python's functions than its statements. I think it is notably trickier if the most common piece of code you pass arguments to has no parentheses, but everything else does.
You could argue the opposite too. If print hadn't been changed, a Python tutorial from 10 years ago would have a better chance of working on a modern Python installation.
I certainly don't remember having trouble with that part.