> What is the real world benefit we will get in return?
If you have many CPU cores and an embarrassingly parallel algorithm, multi-threaded Python can now approach the performance of a single-threaded compiled language.
The real question is whether one couldn't improve multiprocessing instead of multithreading. I was already doing a ton of MPI work with Python ten years ago.
What's more, I am now seeing in Julia that multithreading doesn't scale to larger core counts (like 128) because of the garbage collector. I had to revert to multiprocessing again.
Well I once had an analytics/statistics tool that regularly chewed through a couple GBs of CSV files. After enough features had been added it took almost 5 minutes per run which got really annoying.
It took me less than an hour to add multiprocessing to analyze each file in its own process and merge the results together at the end. The runtime dropped to a couple seconds on my 24 thread machine.
It really was much easier than expected. Rewriting it in C++ would have probably taken a week.
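A minimal sketch of the per-file approach described above, not the commenter's actual tool. The analysis itself (`analyze_file`, counting rows by first column) and the `executor_cls` parameter are hypothetical stand-ins; only `ProcessPoolExecutor` and `Counter` are standard-library names.

```python
import csv
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def analyze_file(path):
    """Hypothetical per-file analysis: count rows by the value in column 0."""
    counts = Counter()
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if row:  # skip blank lines
                counts[row[0]] += 1
    return counts


def analyze_all(paths, executor_cls=ProcessPoolExecutor):
    """Analyze each file in its own worker, then merge the partial results."""
    total = Counter()
    with executor_cls() as pool:
        # pool.map yields one partial Counter per input file, in input order.
        for partial in pool.map(analyze_file, paths):
            total.update(partial)
    return total
```

On platforms that start workers with spawn, the call to `analyze_all(...)` should sit under an `if __name__ == "__main__":` guard. Passing `executor_cls=ThreadPoolExecutor` runs the same code on threads, which is handy for comparing the two.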
let results = files |> Array.Parallel.map processFile
Literally that easy.
Earlier this week, I used a ProcessPoolExecutor to run some things in their own process. I needed a bare minimum of synchronization, so I needed a queue. Well, multiprocessing has its own queue. But that queue is not joinable. So I chose the multiprocessing JoinableQueue. Well, it turns out that that queue can't be used across processes. For that, you need to get a queue from the launching process' manager. That Queue is the regular Python queue.
It is a gigantic mess. And yes, asyncio also has its own queue class. So in Python, you literally have half a dozen or so queue classes that are all incompatible, have different interfaces, and have different limitations that are rarely documented.
That's just one highlight of the mess between threading, asyncio, and multiprocessing.
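For reference, the manager-based workaround described above might look roughly like this. It is a sketch under assumptions, not the commenter's code: `square_into`, `collect_squares`, and `main` are made-up names, and the squaring job is a placeholder. The queue comes from `multiprocessing.Manager()`, whose `Queue()` returns a proxy that can be pickled and passed to pool workers.

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager


def square_into(q, n):
    """Worker: push one result onto the shared queue."""
    q.put(n * n)


def collect_squares(numbers, pool, q):
    """Submit one job per number, wait for all of them, then drain the queue."""
    futures = [pool.submit(square_into, q, n) for n in numbers]
    for future in futures:
        future.result()  # re-raises any exception from the worker
    # Every worker has finished, so exactly len(numbers) items are queued.
    return sorted(q.get() for _ in numbers)


def main():
    # The manager runs a server process that owns the real queue;
    # workers only hold a proxy to it.
    with Manager() as manager, ProcessPoolExecutor(max_workers=2) as pool:
        print(collect_squares([1, 2, 3], pool, manager.Queue()))
```

Call `main()` under an `if __name__ == "__main__":` guard. Because `collect_squares` only needs `submit`, `put`, and `get`, it also works with a `ThreadPoolExecutor` and a plain `queue.Queue`.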
That's not really correct. Python is by far the slowest mainstream language. It is embarrassingly slow. Furthermore, several mainstream compiled languages are already multicore compatible and have been for decades. So comparing against a single-threaded language or program doesn't make sense.
All this really means is that Python is catching up on decades-old language design.
However, it simply adds yet another design input. Python's threading, multiprocessing, and asyncio paradigms were all developed to get around the limitations of Python's performance issues and the lack of support for multicore. So my question is, how does this change affect the decision tree for selecting which paradigm(s) to use?
> Python's threading, multiprocessing, and asyncio paradigms were all developed to get around the limitations of Python's performance issues and the lack of support for multicore.
Threading is literally just Python's multithreading support, using standard OS threads, and async exists for the same reason it exists in a bunch of languages without even a GIL: OS threads have overhead, multiplexing IO-bound work over OS threads is useful.
Only multiprocessing can be construed as having been developed to get around the GIL.
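To make the point about multiplexing IO-bound work concrete, here is a small sketch in which one thread overlaps several waits. `fake_request` and `fetch_all` are hypothetical names, and `asyncio.sleep` stands in for real network IO.

```python
import asyncio


async def fake_request(name, delay):
    """Stand-in for a network call: wait without blocking the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all(jobs):
    """Run all requests concurrently on a single thread.

    gather() returns results in the order the awaitables were passed,
    regardless of which one finishes first.
    """
    return await asyncio.gather(*(fake_request(n, d) for n, d in jobs))
```

Running `asyncio.run(fetch_all([("a", 0.2), ("b", 0.1)]))` should take roughly as long as the longest delay rather than the sum, with no extra OS threads involved.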
No, asyncio's implementation exists because threading in Python has huge overhead for switching between threads and because threads don't use more than one core. So asyncio was introduced as a single threaded solution specifically for only network-based IO.
In any other language, async is implemented on top of the threading model, both because the threading model is more efficient than Python's and because it actually supports multiple cores.
Multiprocessing isn't needed in other languages because, again, their threading models support multiple cores.
So the three, relatively incompatible paradigms of asyncio, threading, and multiprocessing specifically in Python are indeed separate attempts to account for Python's poor design. Other languages do not have this embedded complexity.
> In any other language, async is implemented on top of the threading model
There are a lot of other languages. JavaScript, for example, is a pretty popular language where async on a single-threaded event loop has been the model since the beginning.
Async is useful even if you don't have an interpreter that introduces contention on a single "global interpreter lock." Just look at all the languages without this constraint that still work to implement async more naturally than just using callbacks.
Threads in Python are very useful even without removing the GIL (performance-critical sections have been written as extension modules for a long time, and those often release the GIL).
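As one illustration of an extension module releasing the GIL, `hashlib` documents that it does so while hashing data larger than 2047 bytes, so plain threads can overlap that work. The helper names `digest` and `digest_all` below are made up for this sketch.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    """Hash one buffer; for large inputs hashlib releases the GIL while hashing."""
    return hashlib.sha256(data).hexdigest()


def digest_all(buffers, max_workers=4):
    """Hash several buffers on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(digest, buffers))
```

How much speedup this gives depends on buffer size and core count, so it is worth measuring rather than assuming.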
> are indeed separate attempts to account for Python's poor design
They all have tradeoffs. There are warts, but as designed it fits a particular use case very well.
Calling Python's design "poor" is hubris.
> So my question is, how does this change affect the decision tree for selecting which paradigm(s) to use?
The only effect I can see is that it reduces the chances that you'll reach for multiprocessing, unless you're using it with a process pool spread across multiple machines (which can't share an address space anyway).
Not in the least. Python is a poorly designed language by many accounts. Despite being the most popular language in the world, what language has it significantly influenced? None of note.
> Python is a poorly designed language by many accounts
Hubris isn't rare.
> what language has it significantly influenced?
I can think of at least one language designer[1] who doesn't think it's "poorly designed," based on its significant impact on what they're currently working on[2].
Who cares about how many other languages a language has influenced? If that was a metric of any consideration we all would write Algol or something. Programming languages are tools, tools to help you perform a task.
> Python is by far the slowest mainstream language. It is embarrassingly slow.
Oh? It is by far the fastest language for me. No language comes close on the time from starting to write to having code that runs. For me that time far outweighs the execution time, so it is a lot more important.