by josefx 2772 days ago
You forgot one reason not to switch from 2 to 3: Zero benefit. Nothing python 3 does would benefit my scripts to any extent over just using python 2 as I always did. Is learning 3 a huge hurdle? No, but in some cases it is completely unnecessary.
6 comments

The benefit is that you won't be stunting adoption of Python 3, which will help get us back to a homogeneous Python landscape more quickly.
Seeing that there was a vulnerability in python just a couple days ago, I'd consider getting security updates a benefit.

Most people will stop running python 2 when malware targeting it becomes rampant. So people being able to run your code is another benefit.

:o)

Great point. Python 3 for many just wasn't a great carrot. That's why they didn't switch. For a while it was even slower. Also, most importantly, Python 2 wasn't that bad of a stick. It was just pretty darn good to begin with. And of course, since it is still supported on some OSes, people will keep using it.
Yeah, for writing scripts like this - python 2: 0% chance of thinking about Unicode. python 3: 5% chance i have to waste time debugging some random str decoding issue, no benefits. why bother?
This might be true if you're an English speaker, running the script on an English platform and only consuming data from English services. And also you are sure that no non-English speaker will ever take over the development of your script and you will never have to localize it to other languages. Otherwise, it's exactly the opposite. Python 3 is what you should be using if you want 0% chance of being stopped by a string encoding issue.

Python 3 might occasionally require some extra steps when consuming strings compared to Python 2, but the reality is that those steps were always necessary. Python 2 just hid those details in a way that was only really safe for English-exclusive development. That doesn't mean that Python 2 is easier to use or less brittle. In fact I would say it means the opposite.
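
To make the "extra steps" concrete, here is a minimal Python 3 sketch. The sample word and variable names are made up for illustration; the point is only that bytes and text are separate types, so the decode step that Python 2 did implicitly has to be written out.

```python
# Bytes as they might arrive from a file or socket: "Grüße" encoded as UTF-8.
# (Illustrative sample data, not taken from the thread.)
raw = "Gr\u00fc\u00dfe".encode("utf-8")

# Python 3 keeps bytes and text apart, so decoding is an explicit step.
text = raw.decode("utf-8")

# Picking the wrong codec fails loudly instead of silently producing garbage:
# ASCII has no representation for the bytes of "ü" and "ß".
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# len() counts code points on str and bytes on bytes.
char_count = len(text)
byte_count = len(raw)
```

The difference between `char_count` and `byte_count` is the detail that byte-oriented string handling hides until non-ASCII input shows up.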

For most strings I don't care about language, encoding or related overhead. In my scripts they are best dealt with as opaque bytes with a few specific byte patterns that are the same in ASCII and UTF-8, as well as various other encodings.
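
For what it's worth, the opaque-bytes approach can also be written in Python 3 by staying in `bytes` the whole way. A rough sketch, where `extract_field` and the sample lines are hypothetical and the encoding is assumed to be ASCII-compatible (this would not hold for UTF-16, or for encodings such as Shift-JIS where ASCII byte values can appear inside multi-byte characters):

```python
def extract_field(line: bytes, index: int, sep: bytes = b":") -> bytes:
    """Return one separator-delimited field without ever decoding the line."""
    return line.rstrip(b"\r\n").split(sep)[index]

# The same record in two different encodings (made-up sample data).
utf8_line = "j\u00fcrgen:1000:/home/j\u00fcrgen\n".encode("utf-8")
latin1_line = "j\u00fcrgen:1000:/home/j\u00fcrgen\n".encode("latin-1")

# The ASCII separator is found the same way in both, and the non-ASCII
# bytes pass through untouched.
uid_utf8 = extract_field(utf8_line, 1)
uid_latin1 = extract_field(latin1_line, 1)
name_utf8 = extract_field(utf8_line, 0)
```

Nothing here decodes or re-encodes, so the output is in whatever encoding the input was in.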

Last unicode issue I had was on a system with German characters, because some library assumed it had to explicitly perform encoding with a bad default setting. If the library hadn't tried to be smart, the program would have worked independently of system or language; instead it failed on any non-English system by trying to convert a perfectly fine, system-specific encoding to UTF-8.

Python2 worked fine with Russian input/output.
Why bother? Because Python 2 is EOL in a little less than 2 years.

The NHS chose not to upgrade from Windows XP and 2003, and look where that got them last year: a massive crypto locker infection.

FWIW Win10 was affected by that same horrific SMB RCE vuln. So the implied argument that 10 has been immune or even much less vulnerable to ransomware over the past year or two is on shaky ground, though I agree it probably will start to have some merit going forward.
I suppose unless someone either forks it or keeps delivering patches outside the Python project. That wasn't really an option for Windows XP, but I'm quite sure if it had been then someone would be doing it.
This is already true -- tauthon is an updated, patched Python 2 fork. https://github.com/naftaliharris/tauthon
Put another way: python 2 has a 0% chance of correctly handling Unicode.
Not actually the case. UTF-8, using only SPECIFIC operations that don't try to split up strings or replace things that aren't exact matches for a given text, will result in valid output as long as there was valid input.
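
A small sketch of that claim, written in Python 3 `bytes` terms with made-up sample text and a hypothetical helper `is_valid_utf8`. It relies on UTF-8 being self-synchronizing: a pattern that is itself valid UTF-8 can only match on character boundaries, so exact-match replacement keeps the result valid, while cutting at an arbitrary byte offset does not.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Report whether the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# "naïve café" as UTF-8 bytes (illustrative input).
data = "na\u00efve caf\u00e9".encode("utf-8")

# Exact-match replacement: pattern and replacement are both valid UTF-8,
# so the output is still valid.
replaced = data.replace("caf\u00e9".encode("utf-8"), b"tea")

# Splitting at a byte offset without looking at the content cuts the
# two-byte sequence for "ï" in half.
broken = data[:3]
```

Here `replaced` stays valid UTF-8 and `broken` does not, which is the distinction between the "SPECIFIC operations" and the ones that split strings up.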

All interchanged Unicode text should be UTF-8, never use another encoding* (without a really compelling reason).

No, storing it as an array of unicode characters isn't a compelling reason during interchange.

ALSO, never use a BOM; that will break things.

The second answer (should be anchor-linked) at the link below goes over MOST of the advantages of UTF-8, but it doesn't capture that some operations with carefully chosen inputs in otherwise completely Unicode //unaware// 'string' functions result in no change to string validity.

The only potential issue is if recognizing something in different normalization representations is important. However, for nearly all quick and dirty tasks (where a short script is most likely) it usually doesn't matter. For everything else a different paradigm than the one Python3 picked would be better. (One where adding filters to a read file is OPTIONAL and they can be invoked on individual byte-strings as well.)
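
The normalization caveat, sketched with the standard `unicodedata` module. The sample strings and the helper `normalize_bytes` are hypothetical; the helper is only meant to suggest what an opt-in filter on an individual byte string could look like, assuming the input is UTF-8.

```python
import unicodedata

# The same visible word in two normalization forms (illustrative data):
# precomposed "é" versus "e" followed by a combining acute accent.
composed = "caf\u00e9".encode("utf-8")
decomposed = "cafe\u0301".encode("utf-8")

# A plain byte comparison treats them as different strings.
same_bytes = composed == decomposed

def normalize_bytes(data: bytes, form: str = "NFC") -> bytes:
    """Opt-in filter: normalize one UTF-8 byte string, leaving others alone."""
    return unicodedata.normalize(form, data.decode("utf-8")).encode("utf-8")

# After normalizing both sides to NFC they compare equal.
same_after_nfc = normalize_bytes(composed) == normalize_bytes(decomposed)
```

Only the strings that actually need matching across representations pay for the decode and normalize step.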

https://stackoverflow.com/questions/6882301/what-is-the-best...

You must live in the US.
You don't have to learn Python 3, and there are plenty of cool features (ok well being cool isn't always enough reason but it is a reason) and there are libraries that are not supported on 2.x anymore.

Also you don't have so much longer until support drops entirely for 2.x, and then you really should port your projects. Why wait and port? Just write them in 3.

There is nothing wrong with 3. Just starting in 3 is easy and not an issue. It is basically the same language.

This. I've been trying to make this point myself but people don't seem to want to acknowledge it, so it's gratifying to hear someone else say it too.

What if Python 3 was a separate project that someone other than GvR had put forward, would adoption have the same impetus? I don't think so.