by thomaslee 3303 days ago
If any Python devs are out there reading: my understanding is that removing the GIL itself isn't the hard part so much as removing the GIL while satisfying certain constraints deemed necessary by GvR and/or the rest of the community. I know some of those constraints relate to compatibility with existing C extensions -- but there must be others too?

The reason I ask is that Larry's attempt at buffered ref counting surely has implications for single-threaded code that maybe relies on the existing semantics -- e.g. a program like this may no longer reliably print "Deallocated!":

  Python 2.7.13 (default, Mar  5 2017, 00:33:10) 
  [GCC 6.3.0 20170205] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> class Foo(object):
  ...     def __del__(self):
  ...             print 'Deallocated!'
  ... 
  >>> foo = Foo()
  >>> foo = None
  Deallocated!
  >>> 
A bad example in some ways since in this particular case we could wait for all ref counting operations to be processed before letting the interpreter exit, but hopefully my point is still clear.
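
To make the timing concern concrete, here is a toy model of the general idea of buffering ref count changes. It is only a sketch under my own assumptions, not Larry's implementation: `ToyHeap`, `pending`, and `commit` are hypothetical names, and real buffered ref counting happens in C inside the interpreter. The point is just that the finalizer runs when the buffered operations are committed, not at the moment the reference is dropped.

```python
from collections import deque


class ToyHeap:
    """Toy model (hypothetical): refcount changes are queued, not applied at once."""

    def __init__(self):
        self.refcounts = {}      # object name -> committed reference count
        self.finalizers = {}     # object name -> callable run on deallocation
        self.pending = deque()   # buffered (name, delta) operations

    def allocate(self, name, finalizer):
        # A new object starts with one committed reference.
        self.refcounts[name] = 1
        self.finalizers[name] = finalizer

    def incref(self, name):
        self.pending.append((name, +1))

    def decref(self, name):
        # Only recorded here; nothing is freed yet.
        self.pending.append((name, -1))

    def commit(self):
        # Apply buffered operations in order; finalize objects that reach zero.
        while self.pending:
            name, delta = self.pending.popleft()
            self.refcounts[name] += delta
            if self.refcounts[name] == 0:
                del self.refcounts[name]
                self.finalizers.pop(name)()


def demo():
    events = []
    heap = ToyHeap()
    heap.allocate("foo", lambda: events.append("Deallocated!"))
    heap.decref("foo")                 # roughly `foo = None`
    events.append("after rebinding")   # the program keeps running first
    heap.commit()                      # the finalizer only runs here
    return events
```

In this model `demo()` records "after rebinding" before "Deallocated!", which is the reordering I'm worried about.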

Similarly, what about multi-threaded Python code that isn't written to operate in a GIL-free environment -- absent locks, atomic reads/writes, etc.? At best, you might expect some bad results. At worst, segfaults.
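
As a minimal sketch of what code written for a GIL-free environment might look like (the `Counter` class and `run` helper are made up for illustration), shared state gets an explicit lock rather than leaning on the interpreter. A compound update like `value += 1` is not guaranteed to be atomic even with the GIL, so this is arguably what careful code should do already.

```python
import threading


class Counter:
    """Shared counter whose increment is guarded by an explicit lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read, an add, and a write; holding the
        # lock keeps another thread from interleaving between those steps.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10000):
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run()` should return `n_threads * n_increments` regardless of how the threads are scheduled.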

Are these all bridges that need to be crossed once a realistic solution to the core GIL removal issue is proposed? As glad as I am that folks are still thinking hard about this problem, I'm personally sort of pessimistic that the GIL can be killed off without a policy change wrt backward compatibility. Still, I do sort of wonder if some rules of engagement wrt departures from existing semantics might help drive a solution.

3 comments

If I'm understanding you, some or all of these questions are explicitly addressed in the Q&A. My apologies if you got that far and I simply didn't understand you.

For example, your first question seems to be asking about whether there's a semantic change coming from a lack of immediacy in when __del__ will run. And the answer is explicitly "yes, and the docs already told you not to count on that".

As for multi-threaded Python code... and perhaps also multi-threaded C code in extensions... I think the clear answer is "yes, our whole goal is to remove some guarantees that were previously provided, so if you counted on those guarantees you're in trouble". Again, cf. the Q&A in case that helps.

From the talk, it doesn't look to me like Larry Hastings has a plan for the policy change in question; so maybe "bridges that need to be crossed once [the technical issues are smaller]" is correct?

The big constraint (aside from backwards compatibility) is performance: Guido has indicated that he is unwilling to accept much (if any) slowdown of single-threaded code in order to remove the GIL. It's (relatively) easy to remove the GIL and replace it with a bunch of fine-grained locks (or atomic increments, etc), but doing so tends to slow things down. The challenge is in figuring out how to avoid synchronization overhead for common operations (mainly reference counts).

It's true that buffered refcounting probably means that `__del__` would no longer be called immediately as it is now, but I'm not sure that immediacy is a requirement - PyPy and Jython don't provide it either, and destructors are generally discouraged in favor of `with` blocks these days.
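
For comparison, here is a small sketch of the `with` approach (the `Resource` class and its `log` argument are hypothetical, just for illustration). Cleanup happens when the block exits, so it doesn't depend on when, or whether, the object's ref count reaches zero.

```python
class Resource:
    """Hypothetical resource that cleans up via the context manager protocol."""

    def __init__(self, log):
        self.log = log
        self.closed = False

    def close(self):
        # Idempotent, so calling close() twice is harmless.
        if not self.closed:
            self.closed = True
            self.log.append("closed")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions raised in the block


def demo():
    log = []
    with Resource(log) as res:
        log.append("in use")
    # By this point close() has already run, regardless of refcounting.
    return log
```

Here `demo()` returns the log with "closed" recorded right after "in use", on any interpreter that implements `with`.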

In his talk at last year's PyCon, Hastings said the three constraints GvR laid out are:

1. Can't degrade single-threaded performance

2. Can't break existing extensions

3. Can't make the implementation of CPython much more complicated (i.e., can't raise the barrier to entry to participating in the development of Python)

All of these are pretty reasonable, if tough, targets to meet, and Hastings agrees with all of them. For 1 and 2 he was generally looking at making GIL-less CPython a compiled mode so that the default was the single-threaded version, thus retaining compatibility and performance, but offering a true multi-threaded binary for those who would use it.