by jdormit 2296 days ago
> It's not hard to upgrade your code

People keep writing blog posts about how easy it is to migrate your code, and I just can't imagine that these people have any experience maintaining a Python2 codebase outside of toy projects or libraries.

At work I maintain a Python2 codebase that's been under development since 2016. It's huge, relies on now-deprecated Py2-only libraries that we can't realistically switch away from, is full of Py2-specific syntax, and depends on Py2's handling of Unicode strings. We scoped out what it would take to upgrade it to Py3, and determined that we would basically need to rewrite large parts of it from scratch. This isn't an acceptable trade-off for the business in terms of development time and priorities.

Instead we are slowly breaking out the monolith's functionality into Py3 services, which is much more tightly-scoped (and more palatable for management). But realistically speaking we will be maintaining Py2 code for the foreseeable future, and I can't imagine we are the only company in this situation.

Upgrading a production codebase from Py2 to Py3 isn't easy, or even possible in some cases. And frankly, all these blog posts suggesting that migration is trivial are insulting.

4 comments

95% of the migration effort is trivial (I say this as someone who has migrated 100k+ loc myself as part of a much larger migration effort).

There are things that are tricky, don't get me wrong: unicode handling can be tricky, though six.ensure_str/ensure_binary usually allows you to defer that problem. Futurize and 2to3 do so much of the work that in my experience the majority of files can be migrated by automation, unicode warts included.
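For anyone who hasn't used helpers of this kind, here is a rough sketch of the idea. It is a Python 3-only, stdlib-only approximation and not six's actual implementation; `to_text`, `to_binary`, and `greeting` are hypothetical names made up for illustration.

```python
def to_text(value, encoding="utf-8", errors="strict"):
    """Return value as str, decoding it first if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    if isinstance(value, str):
        return value
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


def to_binary(value, encoding="utf-8", errors="strict"):
    """Return value as bytes, encoding it first if it arrived as str."""
    if isinstance(value, str):
        return value.encode(encoding, errors)
    if isinstance(value, bytes):
        return value
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


def greeting(name):
    # Hypothetical call site: old code paths may pass bytes, new ones pass str.
    # Normalizing here defers the decision about which callers to fix.
    return "Hello, " + to_text(name)
```

Wrapping a call site like this keeps it working for both kinds of caller, which is what "deferring the problem" amounts to: the type confusion is still there, it just stops being a blocker for the rest of the file.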

The hard stuff is when you have extension modules or weird metaprogramming stuff (there's also one particular case I had to deal with around code generation). That takes effort to migrate, but it's the long tail.

If you need to rewrite your code to enable the py3 transition, the issue isn't python, it's your coding choices, and you should rewrite your code anyway.

> If you need to rewrite your code to enable the py3 transition, the issue isn't python, it's your coding choices, and you should rewrite your code anyway.

I... don't even know what to make of this advice. Have you ever worked for an engineering organization with more than 5 members? I can't just go to my boss and say, "I need the next six months to rewrite our whole backend". I'd get laughed out of the room.

Without a clear, business-first reason for an engineering project of that magnitude, it's not ever going to be a priority or a realistic option.

I migrated a production (Django 1.10 + Django REST Framework + lots of custom code) API backend at a major publisher that handles over 500 requests per second from Python 2.7 to 3.7 this year. It wasn't that hard, even without 2to3 or six – mostly just chasing down string vars that now needed `.encode('utf-8')` on their way out the door. We saw an immediate 30% performance increase, so management has been real happy with the effort.
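A minimal sketch of what "encode on the way out the door" tends to look like, assuming a JSON response body. `render_body` is a hypothetical helper, not something from Django or Django REST Framework.

```python
import json


def render_body(payload, encoding="utf-8"):
    """Keep text as str internally; encode to bytes only at the output boundary."""
    text = json.dumps(payload, ensure_ascii=False)  # str in Python 3
    return text.encode(encoding)  # bytes, ready to write to a socket or file


body = render_body({"title": "Caf\u00e9"})
```

The point is to do the conversion in one place at the edge, so everything upstream of it can treat strings as plain `str`.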

> I... don't even know what to make of this advice. Have you ever worked for an engineering organization with more than 5 members?

Yes.

> Without a clear, business-first reason for an engineering project of that magnitude, it's not ever going to be a priority or a realistic option.

I'm not suggesting that there is any realistic solution. I'm simply stating that the cause of your pain isn't py2 to 3, but bad development practices that existed independent of language or migration. These were going to cause you pain at some point no matter what. It just happened that py3 was the forcing function, as opposed to something else next year.

While I won't deny that there were some unfortunate design decisions made in the codebase (some very questionable choices were made at some point), I think that's inevitable in any codebase of a certain size and age that's been worked on by enough people in a company that's changed substantially a number of times.

But that's not the cause of my pain. Bad code can and will be written in Py2, Py3, or any other language. The cause of my pain is that the language developers chose to abandon work on the language that we use in favor of a different language, one that is fundamentally different in some important ways. As a result, the tooling and ecosystem on which we've built our product is slowly stagnating and is getting close to being fully dysfunctional.

If you have bad code you'll eventually feel the cost of that bad code, if the code lives for any significant length of time. Something will make you feel that pain, either a new feature you need to develop or a change to the ecosystem, or something else entirely.

If not py3, then something else. Specifically, the tooling and ecosystem aren't dysfunctional; you're just unable to maintain functionality amid changing requirements. "We must use py3 for our codebase" isn't particularly different from "we must now support Japanese users" or "we need a new endpoint that allows us to view data along a new axis". All 3 changes had the potential to cause you developmental pain. You just hit one prior to the others.

> The cause of my pain is that the language developers chose to abandon work on the language that we use in favor of a different language

Oh now language devs are responsible for your lack of industry practices?

100k is a toy project compared to how many LoC a company that has been around the block for a decade or more has.

Like I said, I personally did 100K; much of that was in the long tail of things that couldn't be automated for some reason or another. The entire migration was significantly larger and highly automated.

If the code base started its development in 2016, why did they choose Python 2 over Python 3?

I wasn't at the company at that time, so I don't really know. Sure would have made my life easier if they had...

EDIT: Just checked, looks like the first commit was in 2015. But Py3 was still an option back then :(

We have a similar situation - we had a programmer working in a very siloed manner, writing lots of Python2 code back in 2013/2014. He's since left the company, and while I'm relatively comfortable with Python, there is now a lot of residual Python2 code that will have to be updated to Python3 or rewritten in another language before our next refresh.

By contrast, some of the Perl6 code we have hanging around for various tasks has aged much better.

Python2 in 2015 is/was defensible. There were still a lot of libraries that did not have support for python3 at that time. The inflection point was probably around 2016.

> People keep writing blog posts about how easy it is to migrate your code

As someone who has had to migrate from Py2 to Py3, it wasn't easy. Unicode/bytes/strings was the hardest aspect of it -- because Python is dynamic, there were breakages that unit tests couldn't exhaustively test for, so things broke in production and we had to scramble to fix them.
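To illustrate why this class of bug slips past tests, here is a small Python 3 sketch of three ways bytes/str mixing can go wrong. The names (`is_admin`, `role_from_socket`, `label`) are hypothetical and not taken from any real codebase.

```python
def is_admin(role):
    # Py2-era check against a native string literal.
    return role == "admin"


# 1. bytes never compare equal to str in Python 3: no exception, just False.
role_from_socket = b"admin"
silently_denied = not is_admin(role_from_socket)

# 2. str() on bytes gives the repr-style text instead of decoding it.
label = "user=" + str(b"alice")

# 3. Concatenating str and bytes raises, but only when that line actually runs.
try:
    "user=" + b"alice"
    raised = False
except TypeError:
    raised = True
```

The first two cases produce wrong results without raising anything, so a test suite that only feeds `str` inputs passes cleanly. Running the interpreter with `-b` (or `-bb` to make it an error) flags bytes/str comparisons and implicit `str()` conversions of bytes, which can help surface these earlier.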

Other aspects were mostly automatically handled by the 2to3 tool.

It seems like the main problem is the network effect from all the libraries. It'd probably be much more feasible to upgrade your code alone than to upgrade your code and also find/build replacements for all of your dependencies.

If so, maybe a more concerted effort needs to be made around getting those upstream projects to upgrade. If some of them are refusing/dragging their feet, maybe the core python team could even get involved to help smooth the transition for everybody.