| I've been building a program that heavily uses multiprocessing for the past few months. It works quite well, but it did take me a little while to figure out the best way to work with it.

> - Threads can't be used efficiently because of the GIL

Python's "threads" are actually fibers. Once you shift your thought process toward that, it's easy enough to work with them. Async is a better solution, though, because "threads" aren't smart when switching between themselves. Async makes concurrency smart. But if you want to use real threads, multiprocessing's "processes" are actually system threads.

> - multiprocesses has to serialize everything in a single thread often killing performance. (Unless you use shared memory space techniques, but that's less than ideal compared to threads)

I'm not quite sure what you mean. Multiprocessing's processes have their own GIL and are "single-threaded", but you can still spawn fibers and more processes from them, as well as use async. Or are you talking about using the Manager and namespaces to communicate between processes? That is a little slow, yes. High-speed code should probably use something else. Most programs will be fine with it, but it is way slower than rolling your own solution. However, it does work easily, so there's something to be said for it.

Shared memory space techniques do work, too, but they are a little obtuse. Personally, I rolled my own data structures using the multiprocessing primitives. You have to set them up ahead of time, but they're insanely fast. Or you can use Redis pub/sub for IPC. Or write to a memory-mapped file.

> - You can't use multiprocess while inside a multiprocess executor. This makes building things on top of frameworks/libs that use multiprocess a nightmare... e.g try to use a web server like over something like Keras...

I'm not sure what you mean. Multiprocessing simply spawns other Python processes. You can spawn processes from processes, so I don't know why you would have issues. Perhaps communication is an issue?

> - The dependency ecosystem is a pita

Yes, absolutely. |
They’re actually not. They are native threads with high lock contention.
Async is arguably fibers, as are greenthreads in libraries like gevent or eventlet.
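If you want to check the "native threads" part yourself, here is a rough sketch. It assumes CPython 3.8 or newer (for `threading.get_native_id()`); the helper name `collect_native_ids` is made up for this example. Each `threading.Thread` reports its own kernel-assigned thread id, which would not be the case for fibers multiplexed onto a single OS thread.

```python
import os
import threading


def collect_native_ids(n_threads=4):
    """Start n_threads threads and return the OS-level thread id of each."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every worker alive at the same time, so the kernel
    # cannot hand a finished thread's id to a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        with lock:
            ids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("process id:", os.getpid())
    print("main thread native id:", threading.get_native_id())
    print("worker native ids:", collect_native_ids())
```

This only shows that the threads are real OS threads. It says nothing about how much CPU-bound work they can do in parallel, which on a standard GIL build is where the lock contention comes in.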
> But if you want to use real threads, multiprocessing's "processes" are actually system threads.
They’re system threads running in separate memory spaces. Also known as… processes.
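A quick way to see the "separate memory spaces" part, as a sketch only: it assumes a POSIX system where the `fork` start method is available, and the names `counter`, `child` and `run_child` are invented for the example. The child changes a module-level variable and reports back over a pipe, and the parent's copy stays untouched.

```python
import multiprocessing as mp
import os

counter = 0  # module-level state; each process gets its own copy


def child(conn):
    """Mutate the global inside the child and report what the child sees."""
    global counter
    counter += 100
    conn.send((os.getpid(), counter))
    conn.close()


def run_child():
    # Assumption: "fork" is available (Linux, macOS). Other start methods
    # isolate memory in the same way; fork just keeps the sketch short.
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_conn,))
    proc.start()
    child_pid, child_counter = parent_conn.recv()
    proc.join()
    return child_pid, child_counter


if __name__ == "__main__":
    child_pid, child_counter = run_child()
    print("parent pid:", os.getpid(), "counter:", counter)
    print("child pid:", child_pid, "counter:", child_counter)
```

The child has a different pid and its own `counter`, which is why anything you want to share has to go through a pipe, a queue, shared memory, or something similar.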