Hacker News new | ask | show | jobs
by rowanG077 406 days ago
Afaik the only guarantee there is, is that a bytecode instruction is atomic. Built in data structures are mostly safe I think on a per operation level. But combining them is not. I think by default every few millisecond the interpreter checks for other threads to run even if there is no IO or async actions. See `sys.getswitchinterval()`
2 comments

Bytecode instructions have never been atomic in Python's past. It was always possible for the GIL to be temporarily released, then reacquired, in the middle of operations implemented in C. This happens because C code is often manipulating the reference count of Python objects, e.g. via the `Py_DECREF` macro. But when a reference count reaches 0, this might run a `__del__` function implemented in Python, which means the "between bytecode instructions" thread switch can happen inside that reference-counting-operation. That's a lot of possible places!

Even more fun: allocating memory could trigger Python's garbage collector which would also run `__del_-` functions. So every allocation was also a possible (but rare) thread switch.

The GIL was only ever intended to protect Python's internal state (esp. the reference counts themselves); any extension modules assuming that their own state would also be protected were likely already mistaken.

Well I didn't think of this myself. It's literally what the python official doc says:

> A global interpreter lock (GIL) is used internally to ensure that only one thread runs in the Python VM at a time. In general, Python offers to switch among threads only between bytecode instructions; how frequently it switches can be set via sys.setswitchinterval(). Each bytecode instruction and therefore all the C implementation code reached from each instruction is therefore atomic from the point of view of a Python program.

https://docs.python.org/3/faq/library.html#what-kinds-of-glo...

If this is not the case please let the official python team know their documentation is wrong. It indeed does state that if Py_DECREF is invoked the bets are off. But a ton of operations never do that.

This is the nugget of information I was hoping for. So indeed even GIL threaded code today can suffer from concurrency bugs (more so than many people here seem to think).