by ProblemFactory 3436 days ago
> I did not propose a solution because there are many, as you note; there are, however, implementations with decent solutions like AFAIK Jython.

There are no solutions that satisfy everyone that I am aware of yet. Guido has said in the past that he'd be happy to get rid of the GIL, and would merge a patch that solves it, as long as:

* It does not reduce the performance of single-threaded Python code.

* It stays compatible with all existing pure Python code and C extensions.

But in practice, the GIL is not much of an issue for many of the application types where Python is popular.

* It's not an issue for web apps, because these are typically served from multiple physical servers, each running multiple Python processes. These do not share a GIL anyway, and "thread safety" is pushed to database transactions.

* It is not an issue for apps which spend most of the time doing I/O. Most I/O libraries release the GIL, so other threads can run while you're waiting for results from the database.

* It is not an issue for data science doing heavy number crunching with NumPy and everything built on top of NumPy. NumPy releases the GIL while doing large computations in C.

* It is not an issue for small scripts, as a "better than Bash".

The GIL is only an issue for apps that do heavy computation in pure Python code, and need parallelism within a single process (socket servers? text data processing?). As a result, many Python users just don't find it a big enough problem to be worth solving, if the solution comes with downsides for their use cases.
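To make the I/O point concrete, here is a minimal sketch, not a benchmark. It assumes `time.sleep` is a fair stand-in for a blocking database or socket call (it releases the GIL while waiting); `fake_io`, `run_sequential` and `run_threaded` are made-up names for illustration only.

```python
import threading
import time


def fake_io(delay):
    # Stand-in for a blocking I/O call (assumption): time.sleep releases
    # the GIL while waiting, so other threads are free to run.
    time.sleep(delay)


def run_sequential(n, delay):
    # Run n fake I/O calls one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    # Run n fake I/O calls in n threads and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.2))
    print("threaded:  ", run_threaded(4, 0.2))
```

The threaded run should finish in roughly the time of a single wait, since the waits overlap. Note this only shows the easy case where nothing CPU-bound is competing for the GIL in the same process.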


The GIL is only a problem because there's "no free lunch" -- no single strategy that is best in all cases.
> It is not an issue for apps which spend most of the time doing I/O.

This is a common misconception that doesn't seem to be backed up by any data.

Dave Beazley did a number of performance tests with profiling, looking at GIL contention in a multi-core scenario: http://www.dabeaz.com/python/GIL.pdf

The results were that even IO-bound workloads still suffered because of the poor implementation of the GIL (details on slide 35 or so). This was an issue up until Python 3.2 (!) when a new GIL implementation was added, which he also profiled: http://www.dabeaz.com/python/NewGIL.pdf
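If you want to poke at this yourself, here is a rough sketch loosely modeled on the kind of CPU-bound countdown test used in those slides. It is not his code; `time_sequential` and `time_two_threads` are names I made up, and the numbers you get will depend on your interpreter version and machine.

```python
import threading
import time


def countdown(n):
    # Pure-Python CPU-bound loop: it holds the GIL while it runs.
    while n > 0:
        n -= 1


def time_sequential(n):
    # Run the countdown twice in the main thread.
    start = time.perf_counter()
    countdown(n)
    countdown(n)
    return time.perf_counter() - start


def time_two_threads(n):
    # Run the same two countdowns in two threads that compete for the GIL.
    start = time.perf_counter()
    t1 = threading.Thread(target=countdown, args=(n,))
    t2 = threading.Thread(target=countdown, args=(n,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 10_000_000
    print("sequential: ", time_sequential(n))
    print("two threads:", time_two_threads(n))
```

Comparing the two timings across interpreter versions and core counts is the simplest way to see how much the GIL implementation itself costs. It does not measure the I/O-bound case directly; for that you would need an I/O thread competing with a CPU-bound one.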

Not backed up by data? Maybe because it's so easy to document that no one bothers to write about it?

Multithreaded IO-bound tasks don't care about the GIL.

Yeah, the old implementations of the language were not as good as the latest. It doesn't seem right to criticize the language for problems that have already been fixed.

"IO-bound multithreading is fine" has been the Python mantra for the last 20 years. Lo and behold, someone actually gathers some data and comes to find out that is absolutely wrong. For the last few years they've had a revamped version of the GIL, but that still has a burden of proof that can only be validated by profiling real-world applications.

A community can't make flat-out invalid claims for two decades and then expect everyone to take them at their word that everything is fine now.