by chrisseaton 1840 days ago
Java has different semantics to Python. To make Python run fast without a GIL you'd have to change the semantics to be more like Java's, which would be a breaking change, which would mean Python 4.0, which is what they say we're unlikely to get.
2 comments

Why? Which different semantics cause a problem?
Currently Python only allows certain thread interleavings, because those are the points where the GIL is released. If you no longer had the GIL, the set of possible thread interleavings would increase. If your application is not safe with a particular interleaving, it can now break. That's a breaking change, so would require Python 4.

It might be possible to retain current interleaving semantics, through research-level techniques such as transactional memory, but these do not seem likely anytime soon despite a lot of work.

It's also unclear how tractable it is to retain current garbage collection semantics without a GIL, so that could have to change as well with similar issues and similar possible solutions which seem unlikely to land anytime soon.

> Currently Python only allows certain thread interleavings, because those are the points where the GIL is released.

How is this currently avoided? My understanding is that anything that accepts a callback / lambda is potentially subject to interleavings as the code can call a C extension which releases the GIL. Moving code to C extensions is often recommended when Python performance is brought up. Am I misunderstanding something?

> It's also unclear how tractable it is to retain current garbage collection semantics without a GIL

Why is it important to retain current garbage collection semantics? Why does Python prescribe a particular garbage collection implementation rather than allowing others, as Java does?

The big difference is that Python doesn't allow very much static analysis.

Consider the expression "a.b.c". Where are the function calls? In other words, which of the ".{name}" return a value from some namespace and which call code? (I'm not asking about which functions are called. I'm just asking about where functions are called.)

Java semantics, including the type system and the required declarations, tell you where the function calls are. One consequence is that they're always in the same place.

That's simply not possible in Python because the answer isn't known until the expression is evaluated AND the answer can change every time the expression is evaluated.
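To make the "a.b.c" point concrete, here is a minimal sketch. The class names (`Plain`, `Computed`, `Holder`) and the `read` helper are made up for illustration. The same expression is a plain namespace lookup on the first evaluation and runs user code on the second, because the object bound to `a.b` was swapped in between.

```python
class Plain:
    """'.c' on an instance is a plain lookup in the instance namespace."""
    def __init__(self):
        self.c = "plain value"


class Computed:
    """'.c' on an instance runs code, because c is a property."""
    calls = 0  # counts how many times the property body has run

    @property
    def c(self):
        Computed.calls += 1
        return "computed value"


class Holder:
    """Hypothetical container so that 'a.b.c' has something to traverse."""
    def __init__(self, b):
        self.b = b


def read(a):
    # Nothing in this expression says whether '.c' calls code.
    return a.b.c


a = Holder(Plain())
first = read(a)            # no user code runs for '.c'
calls_after_first = Computed.calls

a.b = Computed()           # rebinding at runtime changes what '.c' does
second = read(a)           # the property body runs this time
calls_after_second = Computed.calls
```

A `__getattr__` or `__getattribute__` hook on either class would have the same effect on `.b`, so in general neither dot can be classified ahead of time.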

Where is the connection to locking and the GIL?
For example, in CPython multiple threads can append to a list without a lock or insert items into a dict without a lock, and the data structure will never become corrupted. If you do the same thing in Java with an ArrayList, there's a chance that some items are lost.

Though it is worth pointing out that Java has a well-defined memory model, and Python doesn't. So the above is true for CPython, but it might not work in other implementations, or, for example, you might have a library providing data structures, and it might not hold for those, either.
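A small sketch of the list case, assuming CPython. The thread count and loop size are arbitrary, and this relies on CPython implementation behavior rather than on anything the language guarantees.

```python
import threading

N_THREADS = 8        # arbitrary values chosen for illustration
N_APPENDS = 10_000

shared = []          # one list shared by all threads, no lock around it


def worker(tid):
    # Each thread appends its own distinct (tid, i) pairs.
    for i in range(N_APPENDS):
        shared.append((tid, i))


threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(shared)
unique = len(set(shared))
```

On CPython, `total` and `unique` should both come out to `N_THREADS * N_APPENDS`, i.e. no append is lost. Note that this only covers single operations like `append`. A compound step such as "check the length, then append" can still interleave with other threads and needs a lock.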

> For example, in CPython multiple threads can append to a list without a lock or insert items into a dict without a lock, and the data structure will never become corrupted.

How is this achieved?

> Though it is worth pointing out that Java has a well-defined memory model, and Python doesn't.

Wouldn't you have to at least issue an mfence when a thread enters / exits the GIL?

IronPython, based on CLR and .NET, was faster than Python and didn't have a GIL.
IronPython also wasn't compatible with any Python library that used the C API. Essentially we're back to the point that removing the GIL is 'easy' as long as you don't care about backwards compatibility.
In many ways IronPython obviated the C API. It had a different approach to extension, based on CLR and .NET. Even the containers used .NET data types.
Oh absolutely, but that did make IronPython more of a separate language that shared syntax with CPython rather than a drop-in replacement for CPython.
But, again, IronPython does not have the same semantics as standard Python, so this would be a breaking change to adopt, and so would need to be Python 4.
IIRC it included a JIT compiler, which was fast enough to offset the overhead of fine-granularity locking.
You have to consider the use case in addition to the technology. The relatively simple implementation of CPython ensures you have fast startup, but trades off other optimizations which affect long-running programs.

Lots of Python scripts only run for a short time. Code in the REPL needs to return results really quickly. The .NET JIT added significant startup costs, to the point where the IronPython team had to add an interpreted mode for first runs before handing off to the JIT. This is complex and arguably takes time away from CPython compatibility work.

It seems that the .NET team is coming around to the concerns around startup time, but I don't know if they have landed on an AOT solution. It will probably help executables running on .NET Core more than scripts running under IronPython. I've not kept up with JITs very much, but I don't think this pattern is limited to the .NET JIT; it looks like a general trade-off. I feel like JavaScript JITs may be closer to what IronPython would have wanted though.