| From my experience working on an older Python codebase, this issue is definitely a headache. It's extremely difficult to gradually adopt typing in an older Python codebase with almost no typing information because the only real "enforcement" option seems to be a CI pipeline running something like `mypy`. This issue compounds in a painful way. Because 99% of your codebase starts out untyped, you have a couple of options, neither of which I have found to be very useful in practice. For the first option, you can run a blanket `mypy` invocation on your entire codebase and get a massive blast of errors that you ignore for some time. Because it's necessarily going to error in the beginning, you can't really fail your CI pipeline as a result of this yet. If you can convince your team to gradually improve typing or set a deadline for eventual CI failure based on types, you might be able to move the needle and eventually get your codebase typed. From my experience though, this basically just became a CI step everyone ignored, and for people on the team not passionate about typing, they never worried about it. The other option, which is far more annoying in practice, is to pick a few "seed" files in your codebase that you can add typing information to quickly. Then, supplement your `mypy` invocation with a list of these seed files so it becomes something like `mypy file1.py file2.py ...`. As you continue to improve typing "at the edges" of your codebase, you gradually add more and more files to the `mypy` invocation until you're eventually (hopefully) adding entire subfolders, and then maybe eventually the entire codebase. Starting at the edges means you can enforce the CI check from the beginning and get value quickly. The issue here is mostly remembering to continue to add files to the `mypy` invocation, which means you're constantly altering your CI pipeline. You know how when you alter a CI command it breaks sometimes because you got the incantation slightly wrong? 
Multiply this effect across basically every member of your team once a week, because most people probably haven't edited your CI pipeline before. With even a small team (~7-10 people) making constant changes to a codebase, this quickly becomes extremely painful, and pipeline failures start eating a significant chunk of time just to work out whether the incantation is wrong or the types are actually broken. We mitigated this by having only one dev add new files to the `mypy` invocation, which worked well for the CI side of the story. The local side of the story is what ultimately led to enough fatigue to give up. It was hard to get into the rhythm of using local `mypy ...` invocations to check your types as you made changes, and so the experience for most of our team was to push changes, and then the types would break in CI, which was frustrating. They'd go in and try to fix it, and sometimes Python typing gets weird, and a fix wasn't immediately obvious. Eventually you get to `#type: ignore` or `Any`s being thrown around to sidestep the CI pipeline, and your typing story has collapsed again. The real kicker for us was the painful juxtaposition between `mypy` and `import`s. Is the giant swath of errors I'm seeing from this file or from a file I imported? Asking the entire team to become Python typing gurus to sort out these issues was a non-starter. Does anyone have experience gradually adopting Python typing in a large, older codebase successfully? If so, would you mind sharing the methodology you found success with? |
Rather than doing this, which does indeed seem like a headache, it may make more sense to skip import following at the very beginning (mypy's `--follow-imports=skip` option) until your core is typed, so you can still enforce typing on the leaf nodes moving forward.
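As a rough sketch of what that might look like, assuming the seed files are kept in a plain-text list rather than in the CI command itself: the file name `typed_files.txt` and the helper names below are hypothetical, and the only mypy-specific piece is the `--follow-imports=skip` flag, which makes mypy treat modules that are not listed on the command line as `Any` instead of analyzing them.

```python
"""Sketch: run mypy only on an explicit list of already-typed files."""
import subprocess
import sys
from pathlib import Path


def read_seed_files(list_path):
    """Return the paths listed in a text file, skipping blanks and '#' comments."""
    paths = []
    for raw in Path(list_path).read_text().splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def build_mypy_command(paths, follow_imports="skip"):
    """Build the mypy argument list for the given files.

    With --follow-imports=skip, imports of unlisted modules are not
    followed, so their errors do not show up in the output.
    """
    return [sys.executable, "-m", "mypy", f"--follow-imports={follow_imports}", *paths]


def run_mypy(list_path="typed_files.txt"):
    """Run mypy on the listed files and return its exit code (hypothetical entry point)."""
    paths = read_seed_files(list_path)
    if not paths:
        print("no files listed; nothing to check")
        return 0
    return subprocess.run(build_mypy_command(paths)).returncode
```

With something like this, the CI step and the local check could both call `run_mypy()`, and adding a newly typed file means appending one line to the list instead of editing the pipeline command.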
> Eventually you get to `#type: ignore` or `Any`s being thrown around to sidestep the CI pipeline, and your typing story has collapsed again
While there are some cases where this is truly the best option, ultimately you get to the point where you just don't allow it; otherwise, what's the point of all the effort?
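One way to make "not allowed" concrete is a small guard that reports every `type: ignore` comment so a reviewer or CI step can see them. The following is only a sketch under that assumption: `find_type_ignores` is a hypothetical helper, not part of mypy. mypy itself also has options in this area, such as `--warn-unused-ignores` and `--disallow-any-explicit`, which may be a better fit than a custom script.

```python
"""Sketch: locate type-ignore comments in Python source text."""
import io
import re
import tokenize

# Matches "# type: ignore", optionally followed by an error code in brackets.
IGNORE_RE = re.compile(r"#\s*type:\s*ignore(\[[^\]]*\])?")


def find_type_ignores(source):
    """Return (line_number, has_error_code) for each type-ignore comment.

    Tokenizing (rather than searching raw text) means the same characters
    inside a string literal are not reported.
    """
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            match = IGNORE_RE.search(tok.string)
            if match:
                hits.append((tok.start[0], match.group(1) is not None))
    return hits
```

A team could, for example, tolerate ignores that carry a specific error code and a justification while rejecting bare ones; where to draw that line is a policy choice rather than something the script decides.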
> and for people on the team not passionate about typing, they never worried about it <...> Asking the entire team to become Python typing gurus to sort out these issues was a non-starter.
The faster the core can be typed (and typed correctly), the easier it becomes for those who are less passionate. Presumably someone has done the calculus to determine that this effort is worthwhile, so while the team doesn't necessarily all have to reach guru level, they do need to be convinced to continue the work. Removing barriers is huge for this since, as you've noticed, once typing becomes easy to ignore, it's really challenging to stop ignoring it.