Hacker News
by sherjilozair 1713 days ago
You know it’s trivial to add a global lock to any concurrent program, right? What would you lose if other people started writing performant code?

This comment under the article gives an example: https://lwn.net/Articles/872961/. Several other comments there also discuss this problem.

Unless an extension explicitly releases the GIL it's not possible for the state of the interpreter to change during execution of its methods. That's an invariant that extensions rely on implicitly for safety in many ways and it's hard to imagine how one could make them safe without significant work on all of those extensions. I, for one, own extensions that would require complex, structural, performance-affecting changes.

And it's worth noting that "safety" here is not just safety from incorrect behaviour, it's safety from memory corruption, crashes and security issues.

Edit: Also, just to note - there is nowhere extension authors can add a global lock that would solve this problem. It would require top-level python programs to add the necessary locking, and the consequences of them not doing so would typically include crashes and severe security issues. The only place it's "trivial" to add a global lock to avoid these problems is the interpreter. A Global Interpreter Lock, if you will.

But, of course, if you think the GIL can be removed in such a way that these issues aren't a real problem, have at it. Plenty of people will thank you.

Edit 2: Also worth explicitly mentioning: when it comes to avoiding memory corruption, extension authors can't make any assumptions about what their python callers will do. I (and any responsible extension author) go to significant lengths to ensure my extensions can't crash regardless of how they're used from python.

One of my extensions, for example, is a (private, in-house) interop mechanism that allows python users to access an API developed in C#. If the GIL is removed and somebody goes and writes a bit of threaded python code that modifies the contents of some object while my extension is accessing it, without the necessary locking, and this results in memory corruption, the blame will rightly fall on my extension. Python isn't C, and the people writing it (unless they're using ctypes or whatever) don't expect to be able to cause memory corruption by making elementary programming errors.
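
To make the hazard concrete, here is a minimal pure-Python sketch of the "check, then use" pattern described above. Everything in it is hypothetical: `read_last` stands in for an extension method, and the `pause` hook only exists to force the unlucky interleaving deterministically. In Python the stale index surfaces as an `IndexError`; the equivalent C code would read out of bounds instead. With the GIL held, extension code between the check and the use is not interrupted by other Python threads unless it releases the GIL or calls back into the interpreter.

```python
import threading


def read_last(items, pause=None):
    """Hypothetical model of an extension method: check a size, then index with it."""
    n = len(items)              # step 1: check
    if pause is not None:
        pause()                 # assumed hook: the point where another thread could run
    try:
        return items[n - 1]     # step 2: use (in C this would be an unchecked read)
    except IndexError:
        return None             # stale index detected; C would not be this lucky


def run_with_interleaving():
    """Force a mutation to land exactly between the check and the use."""
    items = [1, 2, 3]
    checked = threading.Event()
    mutated = threading.Event()

    def mutator():
        checked.wait()          # wait until the reader has taken its length
        items.clear()           # then shrink the list underneath it
        mutated.set()

    def pause():
        checked.set()           # let the mutator run...
        mutated.wait()          # ...and resume only after it has finished

    worker = threading.Thread(target=mutator)
    worker.start()
    result = read_last(items, pause)
    worker.join()
    return result


print(read_last([1, 2, 3]))     # 3
print(run_with_interleaving())  # None
```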

Is there a workaround here where that locking can be applied automatically from above when an extension hasn’t explicitly said it’s GIL-free safe?

Yes, but the locking inherently has to be global so it would have to be effectively the actual GIL if needed. The interpreter would therefore have to have a gil/nogil mode, maybe switched by a command-line parameter (you wouldn't want gil mode to be implicitly enabled interpreter-wide just because you imported a particular module). That's certainly possible, but I doubt it would be popular.

I guess that’s what I mean, yeah. For people that just don’t need these extensions, you could run without the GIL and then if it was required you could start using it while interacting with unsupported extensions.
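
For reference, the free-threaded builds of CPython 3.13 later shipped something close to this gil/nogil switch: the GIL can be selected at startup with `-X gil=0` or `-X gil=1` (or the `PYTHON_GIL` environment variable), and `sys._is_gil_enabled()` reports the current mode. As far as I know, those builds also turn the GIL back on when an extension that has not declared itself safe is imported. The sketch below only assumes that interpreters without the function always run with the GIL; `gil_enabled` is a hypothetical helper name.

```python
import sys


def gil_enabled():
    """Report whether this interpreter is currently running with the GIL."""
    check = getattr(sys, "_is_gil_enabled", None)
    if check is None:
        # Assumption: interpreters without this function have no nogil mode.
        return True
    return bool(check())


print("GIL enabled:", gil_enabled())
```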

I get what you’re saying, but for a lot of us, just being able to do nice multithreading for io would be a great enhancement. We have a use case where we’d like a bunch of threads to search through a large numpy structure, and at the moment we have to stoop to multiprocessing, which works but is really heavy.
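
A rough sketch of the shape of that use case, using only the standard library: a plain list stands in for the numpy structure, and `find_in_chunk` and `threaded_search` are hypothetical names. Each thread scans its own slice of one shared object, so nothing is copied or pickled the way it would be with multiprocessing. With the GIL, this pure-Python loop will not actually run in parallel; the threads only overlap where the underlying code releases the GIL.

```python
from concurrent.futures import ThreadPoolExecutor


def find_in_chunk(data, start, stop, target):
    """Return the indices in [start, stop) where data[i] == target."""
    return [i for i in range(start, stop) if data[i] == target]


def threaded_search(data, target, workers=4):
    """Split the index range into contiguous chunks and search each in a thread."""
    n = len(data)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so indices come back sorted.
        parts = pool.map(lambda b: find_in_chunk(data, b[0], b[1], target), bounds)
        return [i for part in parts for i in part]


data = [i % 1000 for i in range(100_000)]  # shared by all threads, never copied
print(threaded_search(data, 999)[:3])      # [999, 1999, 2999]
```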