by r0l1 1050 days ago
We are working with a huge Go and Python codebase and Python is just a pain in terms of using all system resources. We moved many parts to C++ which are called and handled by goroutines. The outcome was a big success. This proposal/change is a big step forward, especially for the deep learning community.

> "Python is just a pain in terms of using all system resources"

Whatever this comment means (I honestly can't properly tell) - removing the GIL will have absolutely no impact on Python's resource utilization.

https://peps.python.org/pep-0703/

Quote: "In PyTorch, Python is commonly used to orchestrate ~8 GPUs and ~64 CPU threads, growing to 4k GPUs and 32k CPU threads for big models. While the heavy lifting is done outside of Python, the speed of GPUs makes even just the orchestration in Python not scalable. We often end up with 72 processes in place of one because of the GIL. Logging, debugging, and performance tuning are orders-of-magnitude more difficult in this regime, continuously causing lower developer productivity."

Quote: "We frequently battle issues with the Python GIL at DeepMind. In many of our applications, we would like to run on the order of 50-100 threads per process. However, we often see that even with fewer than 10 threads the GIL becomes the bottleneck. To work around this problem, we sometimes use subprocesses, but in many cases the inter-process communication becomes too big of an overhead. To deal with the GIL, we usually end up translating large parts of our Python codebase into C++. This is undesirable because it makes the code less accessible to researchers."
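For anyone who wants to see the effect both quotes describe, here is a minimal sketch, not taken from the PEP. It runs the same CPU-bound, pure-Python function through a thread pool and a process pool. The `count_primes` and `run_pool` names and the workload sizes are made up for illustration. On a standard GIL build the threaded run will typically take about as long as running the tasks serially, while the process pool can use several cores at the cost of pickling and inter-process communication.

```python
import math
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound, pure-Python work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, isqrt(n)] divides it
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def run_pool(executor_cls, tasks, workers=4):
    """Map count_primes over `tasks` with the given executor.

    Returns (results, elapsed_seconds). Hypothetical helper for this sketch.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    tasks = [200_000] * 4  # arbitrary workload size, chosen only for illustration
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, elapsed = run_pool(cls, tasks)
        print(f"{cls.__name__}: {elapsed:.2f}s, results={results}")
```

Timings depend on the machine and the interpreter build, so none are claimed here. On a free-threaded build the thread pool would be expected to scale across cores too.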

This requirement could have been well served by a GIL per thread and an arena-based (shared) object allocation model. Every other use case would have been unaffected.

Now we change the world for everyone and put most library developers through a valley of desperation for 5+ years, just so that a few narrow use cases get the benefits they want.

Not a smart move IMHO.

Good point. Did the Meta and DeepMind devs really miss this?

I try to avoid Python as much as possible, because I mainly work with Go & C++, and multi-threading in those languages is just better (imho). Bringing Python a step forward and making it future-proof might be a good thing... even if this means breaking some things? Not sure if dropping the GIL is the right step, but there is a big performance gap to fix. Or maybe the AI community must move to a better-suited language? Having Python code in production just feels so wrong, especially if a rewrite in another language shows the performance gap.

The PEP notes subinterpreters as an alternative and says they can be considered a valid approach to achieving parallelism. However, it does not discuss why nogil was given preference. I guess that's ok because the PEP is about nogil.

I'm not sure whether the SC has considered alternative approaches, but it would be surprising if not.

The use cases of the ML and AI world are very important though, as they massively contribute to Python's popularity. Thanks to Python, researchers and developers don't have to use different languages and library ecosystems for developing and scaling models.

Alas, subinterpreters sound like they could be a feasible solution for many use cases as well.

And they couldn't switch to another language? It sounds really odd to me, too odd to be a justification for this change.