|
Sharing objects is completely fine with a forking multiprocessing pool. Each worker process ends up with a (lazy, copy-on-write) copy of the parent address space, including the whole Python interpreter state. When you think about it, you already rely on that to access imported modules, global variables, etc. in your worker processes. The GIL and reference counting are not a concern either, as they are both copied from the parent process as well, so you can freely continue to read global state (and even modify it, though the changes stay local to the worker process). That is, if you have large objects to share with your worker processes, and you don't need to get them back, you can simply assign them to global variables before starting your multiprocessing pool and have your workers read them, safely and at zero transfer cost. This _trick_ has been used for decades in Python. Now the annoying thing is that the global variable mumbo jumbo is kind of dirty and ad hoc to maintain. Using custom picklable weakrefs (using something similar to the trick with id() in this repo), you can create weak proxy objects to transfer to your workers, achieving the same effect while keeping a "function argument" interface instead of proxying through global variables. |
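To make the global-variable approach concrete, here is a minimal sketch. It assumes a platform where the `fork` start method is available (Linux and most Unix systems); `BIG_TABLE`, `lookup`, and `run_lookups` are hypothetical names used only for illustration.

```python
import multiprocessing as mp

# Hypothetical large object. It is assigned in the parent before the pool
# is created, so forked workers inherit it without any pickling.
BIG_TABLE = None


def lookup(i):
    # Runs in a worker: reads the global inherited at fork time.
    # Only the small index `i` and the result travel between processes.
    return BIG_TABLE[i]


def run_lookups(indices):
    global BIG_TABLE
    BIG_TABLE = list(range(1_000_000))  # stand-in for an expensive object
    ctx = mp.get_context("fork")        # assumption: fork is supported here
    with ctx.Pool(2) as pool:
        return pool.map(lookup, indices)


if __name__ == "__main__":
    print(run_lookups([0, 10, 999_999]))
```

The object must be assigned before the pool is created. Workers only see the parent's state as it was at fork time, so later assignments in the parent do not reach them.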
But also, you say you would prefer such an unbound memory access hack over using a global variable?
But also, why does it need to be a global variable? After you fork(), all the local variables are available to the child process too. No need for global variables.
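A rough sketch of the local-variable point, again assuming the `fork` start method is available. `sum_in_child` and `work` are hypothetical names. With `fork`, the `Process` target is not pickled, so a nested function that closes over a local variable works.

```python
import multiprocessing as mp


def sum_in_child():
    data = list(range(1000))  # a plain local variable, never made global
    ctx = mp.get_context("fork")  # assumption: fork is supported here
    # Pipe(duplex=False) returns (receive-only end, send-only end).
    recv_end, send_end = ctx.Pipe(duplex=False)

    def work():
        # Runs in the forked child; `data` and `send_end` are inherited
        # through the copied address space, not passed as arguments.
        send_end.send(sum(data))
        send_end.close()

    proc = ctx.Process(target=work)
    proc.start()
    result = recv_end.recv()  # only the small result is transferred back
    proc.join()
    return result


if __name__ == "__main__":
    print(sum_in_child())
```

One caveat: this works for a `Process` started directly. A `Pool` sends each task through a queue, and the task's function is pickled, so a nested closure like `work` could not be passed to `pool.map`.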