Hacker News
Python and Async Simplified (2018) (aeracode.org)
73 points by dil8 1336 days ago
7 comments

It gives good pointers but it falls short on the usual suspects for an article on asyncio.

When teaching it, it's important to emphasize:

- await is locally blocking, so you should isolate linear workflows into their own coro, which is the unit of concurrency.

- to allow concurrency, you should use asyncio.create_task on coro (formerly ensure_future).

- you should always explicitly delimitate the life cycle of any task. Right now, this means using something like gather() or wait(). TaskGroup will help when it becomes mainstream.

An HN comment is not a great place to explain that, but if you read the article, you should investigate those points. There is no good asyncio code without them, only pain and disappointment.
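
For what it's worth, here is a minimal sketch of those three points together. The `fetch` and `workflow` coroutines are made up for illustration, with `asyncio.sleep` standing in for real I/O:

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # Hypothetical stand-in for an I/O-bound call.
    await asyncio.sleep(delay)
    return f"{name} done"


async def workflow(name: str) -> list:
    # A linear workflow: each await blocks *this* coroutine only,
    # so step 2 never starts before step 1 has finished.
    first = await fetch(f"{name}-step1", 0.01)
    second = await fetch(f"{name}-step2", 0.01)
    return [first, second]


async def main() -> list:
    # create_task schedules each workflow, so the three run concurrently.
    tasks = [asyncio.create_task(workflow(n)) for n in ("a", "b", "c")]
    # gather delimits the life cycle of the tasks: main() does not
    # return until every one of them has finished.
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print(results)
```

Each workflow is sequential on the inside, and the concurrency only exists between the tasks.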

> ... TaskGroup will help when it becomes mainstream.

Strongly agreed, but you can use anyio [1] on top of asyncio to get that functionality right now. Or, maybe even better, use Trio [2] instead, which is where the idea came from in the first place.

[1] https://anyio.readthedocs.io/en/stable/

[2] https://trio.readthedocs.io/en/stable/

> to allow concurrency, you should use asyncio.create_task on coro (formerly ensure_future).

This is misleading... you can use asyncio.gather which does this internally [0].

[0]: https://github.com/python/cpython/blob/main/Lib/asyncio/task...

Only if you wish to collect tasks where you schedule them, accumulate the results and don't need to limit concurrency.

Serious question, why would you limit concurrency in async world? You’re still on a single thread, why would you want to only schedule n things at a time?

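
One common reason is that the thing on the other end (a remote API, a database, a file descriptor limit) can only take so many requests at once, even if your single thread could happily juggle more. A rough sketch of capping the number of in-flight jobs with `asyncio.Semaphore`; the `fetch` coroutine and the `stats` dict are hypothetical and only there to make the cap observable:

```python
import asyncio


async def fetch(i: int, limit: asyncio.Semaphore, stats: dict) -> int:
    # Only `max_in_flight` coroutines can be inside this block at once.
    async with limit:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for a network call
        stats["active"] -= 1
    return i


async def main(n_jobs: int = 20, max_in_flight: int = 5):
    limit = asyncio.Semaphore(max_in_flight)
    stats = {"active": 0, "peak": 0}
    # All 20 jobs are scheduled, but the semaphore throttles them.
    results = await asyncio.gather(
        *(fetch(i, limit, stats) for i in range(n_jobs))
    )
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(f"{len(results)} jobs done, at most {peak} in flight at once")
```
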
> you should always explicitly delimitate the life cycle of any task

Unless you want a hacky actor system, in which case it's totally fine to `create_task` a ton of coroutines which have their own spin loop with await sleep :)

Even if you want to ‘fire and forget’, it’s still essential to keep a reference to the task, otherwise it can be garbage collected mid-execution:

https://docs.python.org/3/library/asyncio-task.html#asyncio....
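
The pattern suggested in those docs is to park each task in a set and let it remove itself when it finishes. A small sketch of that idea; the `spawn` helper and the `job` coroutine are names I made up:

```python
import asyncio

# Strong references to running tasks. The event loop itself only keeps
# weak references, so without this a task could be collected mid-flight.
background_tasks = set()


def spawn(coro) -> asyncio.Task:
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference once the task has finished.
    task.add_done_callback(background_tasks.discard)
    return task


async def job(log: list, n: int) -> None:
    await asyncio.sleep(0.01)
    log.append(n)


async def main() -> list:
    log = []
    for n in range(3):
        spawn(job(log, n))  # "fire and forget", but still referenced
    # Before shutting down, wait for whatever is still outstanding.
    await asyncio.gather(*background_tasks)
    return log


log = asyncio.run(main())
print(sorted(log))
```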

Wow! Did not know this, guess I’ve got a couple fixes to make…

One coroutine crashing and the others continuing to send it messages without noticing was my $40,000 bug.

I’m so confused by this architecture. It makes total sense in a threaded world, but why would you want a coroutine constantly scheduling itself in a loop to pull messages off a queue-like thing rather than just having the thing generating the message fire off a task to process it directly right there? It feels almost the same to me and then you can’t crash the coroutine.

At some point even those tasks must be cleanly stopped and unless you want to play Erlang and "let it crash", the actors have a lifecycle as well. Making it explicit will avoid much pain, and ease testing a lot. Also it will make resource consumption more predictable.

This is the bit that should be at the very top of the official docs. It's tripped me up every time I go to write async code and until you learn it, the error message is very confusing.

> In particular, calling it will immediately return a coroutine object, which basically says "I can run the coroutine with the arguments you called with and return a result when you await me".

> The code in the target function isn't called yet - this is merely a promise that the code will run and you'll get a result back, but you need to give it to the event loop to do that.

If I try to pass the async function to gather (for example) without calling it, which makes some intuitive sense, since functions are first class objects and I know I'm not calling it, the event loop is, the error message reads something like, "gather only accepts coroutines." But I thought it was a coroutine because I declared it with async! For some reason it took me a silly amount of time to notice that in all the examples, the async function is called when it's passed to gather (or whatever). That's not intuitive to me and the distinction made in the article should be clearer in the docs.

> If I try to pass the async function to gather (for example) without calling it, which makes some intuitive sense

That intuition breaks immediately when you realize that those functions can have arguments, and you have no way to pass them.
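
A short sketch of the distinction, using a made-up `double` coroutine function. Calling it is how the arguments get bound, and the call runs none of the body:

```python
import asyncio
import inspect


async def double(x: int) -> int:
    await asyncio.sleep(0)
    return 2 * x


async def main():
    # `double` itself is a coroutine *function*...
    is_func = inspect.iscoroutinefunction(double)
    # ...and calling it returns a coroutine *object* with the
    # arguments bound; the body has not started yet.
    coro = double(21)
    is_coro = inspect.iscoroutine(coro)
    value = await coro  # now the body actually runs

    # Passing the uncalled function to gather is rejected.
    try:
        await asyncio.gather(double)
        error = None
    except TypeError as exc:
        error = exc
    return is_func, is_coro, value, error


is_func, is_coro, value, error = asyncio.run(main())
print(is_func, is_coro, value, type(error).__name__)
```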

Similar question to the other one at the time of writing, but more specific: does anyone have a good, thorough introduction to the "async event loop" (sometimes known as "asyncio") pattern? By thorough I mean that it goes beyond a starter tutorial, into both examples of various supporting libraries and implementation details that matter for usage. I'm fine with a book, too.

There are popular libraries for it in both Python and Perl and I suspect I could make good use of it if I understood it.

Unfortunately, I've only ever used it in a cargo cult manner of sticking together functions until the error messages go away (yeah yeah, it was only for "throwaway" "prototypes") so I really don't understand how it all is meant to fit together.

I found this post to be an amazing intro that shows you how to go from simple generators to an async event loop.

https://mleue.com/posts/yield-to-async-await/
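
To give a taste of the generator starting point (this is my own toy version, not code from the post): each task is a generator that yields whenever it is willing to give up control, and the "event loop" is a queue that keeps resuming them in turn.

```python
from collections import deque


def countdown(name: str, n: int, log: list):
    while n > 0:
        log.append(f"{name}:{n}")
        n -= 1
        yield  # hand control back to the loop


def run(tasks) -> None:
    # A round-robin scheduler: resume each task until its next yield.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule it
        ready.append(task)


log = []
run([countdown("a", 2, log), countdown("b", 3, log)])
print(log)  # ['a:2', 'b:3', 'a:1', 'b:2', 'b:1']
```

Roughly speaking, a real loop adds the missing piece of resuming a task only once the I/O or timer it is waiting on is ready.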

This was very good! But true to what's common, it stops just as it gets really juicy!

I thought Python couldn't multithread because of the GIL? I understood from the article that async derives all its benefit from certain OS-level operations which don't need to run in a coroutine, like reading from a network socket or waiting for timers to finish.

Another question: Is Python's implementation of async/await identical to other languages? In particular, do they always use coroutines instead of threads?

since it's my job to clear these things up, a few pointers:

1. python has threads. they just cannot perform CPU bound tasks in parallel due to the GIL. The GIL is released for IO, so threads can perform IO waiting in parallel, just like asyncio

2. asyncio runs in one thread, and has the exact same limitations as threads as implemented in Python: CPU operations are serialized, async tasks can yield for IO.

the advantages offered by asyncio are:

1. you can have thousands of tasks extremely quickly and cheaply, which is not as much the case for threads in Python. this can allow for massive concurrent architectures more expediently, provided your concurrency is very IO bound (if you are CPU bound, disaster)

2. people just like asyncio's programming model, IMPO this is largely due to the popularity of Javascript's event-based model being natural for lots of newer programmers
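
As a rough illustration of the "thousands of tasks" point, a sketch that runs 10,000 concurrent waits, with `asyncio.sleep` as an assumed stand-in for IO. Timings will vary by machine, so take the printed number as indicative only:

```python
import asyncio
import time


async def wait_for_io(i: int) -> int:
    await asyncio.sleep(0.1)  # pretend this is a slow network call
    return i


async def main(n: int = 10_000):
    start = time.perf_counter()
    # One task per "connection"; all of them wait at the same time.
    results = await asyncio.gather(*(wait_for_io(i) for i in range(n)))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```

Run one after another, the same waits would add up to about 1000 seconds.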

I've been coding Python since the 2.5 days and I have yet to have a use case where I've really needed asyncio. For client-side code, concurrent.futures (specifically ThreadPoolExecutor) has satisfied nearly every use case, though occasionally I'll use a worker-thread model.
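
For anyone who hasn't used it, the ThreadPoolExecutor approach looks roughly like this. The `download` function and the URLs are placeholders, with `time.sleep` standing in for a blocking request:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download(url: str) -> str:
    # Blocking call; the GIL is released while the thread waits.
    time.sleep(0.05)
    return f"fetched {url}"


urls = [f"https://example.invalid/{i}" for i in range(8)]

# A fixed pool of 4 worker threads; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    pages = list(pool.map(download, urls))

print(pages[0], len(pages))
```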

For server-side code, I'd still probably use threads up to maybe 1000 concurrent connections. Beyond that, I've used gevent to good effect. e.g., I have a server that receives HTTP POSTs which are multipart forms, the form having 3 parts, a JSON part and two file parts. The two file parts get written to files on S3 and the JSON part to SQS. The web framework is Falcon[1] and I also made use of a Cython-based HTTP form parser[2]. Concurrency is handled via gevent. Openresty sits in front and invokes the Python server via uwsgi. At the time I developed it, asyncio was not yet mature and not supported by boto3. I benchmarked against pypy but unsurprisingly (since it's I/O bound) got better performance from CPython + gevent.

If I were developing it from scratch today, I'd re-evaluate the asyncio story, or more likely than not, choose a different language.

I don't doubt that there are use cases to which asyncio is well-suited and the right choice, but I suspect folks may be using it in cases where they'd be fine with threads. As always, there are trade-offs.

1. https://falconframework.org/

2. https://pypi.org/project/streaming-form-data/ (I think)

For me it's not about efficiency. Using asyncio is just easier than threads.

* One coroutine can only interrupt another one at a point clearly marked with await (or async for or async with). That makes it easier to avoid data races without explicit synchronisation like locks.

* It's much easier to spawn async tasks and avoid them getting lost than with threads, assuming you use asyncio task groups (either by using a future version of Python, or using the anyio library now, or using Trio instead of asyncio).

* Async operations all have first class support for cancellation, and this interacts really cleanly with task groups. That helps with things like time outs, clean shutdown of your program, or cleaning up all resources related to a connection when that connection is closed.

* There's a bit more boilerplate in spawning threads and exchanging messages with them and joining them than there is in spawning async tasks, especially when using task groups. (Admittedly, this is a solvable problem, and there are probably good libraries out there to help with this.)
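
To illustrate the cancellation point with plain asyncio (no task groups here, just `asyncio.wait_for`), a sketch in which a hypothetical `handle_connection` coroutine is cut off by a timeout and still gets to clean up:

```python
import asyncio


async def handle_connection(log: list) -> None:
    log.append("opened")
    try:
        await asyncio.sleep(10)  # stand-in for a peer that never answers
        log.append("finished")
    finally:
        # Cancellation is raised at the await above, so cleanup
        # code in finally still runs.
        log.append("closed")


async def main() -> list:
    log = []
    try:
        # On timeout, wait_for cancels the coroutine and waits for it
        # to finish unwinding before raising.
        await asyncio.wait_for(handle_connection(log), timeout=0.05)
    except asyncio.TimeoutError:
        log.append("timed out")
    return log


log = asyncio.run(main())
print(log)  # ['opened', 'closed', 'timed out']
```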

well that's the thing with threads, you shouldn't be "spinning them up on the fly", you should have a fixed pool of threads. That then involves some architectural work up front (like 5 lines of code, ugh) and that's where everyone (under age 40) yawns and goes off to use asyncio instead (which oddly enough has a worker thread running in the form of the event loop, it's just all been presented nicely).

It sounds like you're talking about a different situation. The parent comment was talking about thread-per-connection with blocking IO read call on each. Yes that means spinning up and shutting down threads as connections open and close, and that is 100% a valid strategy. If you have a fixed pool of n threads and you get n+1 connections then you're just going to have to ignore one at any given time (potentially causing deadlock depending on the relationship between the connections) or end up using a multiplexing API at which point you're not far off from async world anyway.

Maybe you're talking about just submitting independent work items to run concurrently – yes async won't help much with that, because you're in the most trivial situation possible.

In more complex situations, with interrelationships between tasks (/threads), async syntax and task groups definitely have a huge impact. And, as I said, that's before you even get into how much easier it makes cancellation.

in my largely non-scientific experience, Python starts to fall over at about 50 threads, 1000 seems impossible but I haven't really tried.

1000 threads was a very specific use case I probably shouldn't have generalized from where I needed to match the number of Python threads running in a web server to a Java process running on the same host using the same number of threads. They were mostly idle.

There's no reason Python should fall over at any number of threads. You just usually end up either running out of memory or (more likely) saturate a single CPU core well before that number of threads.

Without consulting my notes I can't recall why I didn't use gevent on that project.

The GIL gets even more complicated than that because it can also be released during CPU bound tasks that don't interact with Python objects (e.g. array operations in numpy)

Right, if your cpu tasks are running in native extensions, then threading will actually allow parallelism, whereas asyncio will not.

I understand async/await in Python to be entirely single-threaded. So is, for example, C#'s implementation: https://learn.microsoft.com/en-us/dotnet/csharp/programming-... ("The async and await keywords don't cause additional threads to be created.")

Eliminates blocking on IO requests, letting the event loop spend CPU cycles doing non-IO work. Alternative is CPU doing nothing while waiting on IO, which for something like a web app doing lots of small network requests to database/cache can add up to a lot. CPU work is still single-threaded.

> I thought Python couldn't multithread because of GIL?

Why would it need to in this case? You only need one thread for concurrent I/O.
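
You can check this for yourself. In the sketch below (the `request` coroutine is made up, with `asyncio.sleep` in place of real I/O), a hundred waits overlap and every one of them reports the same thread id:

```python
import asyncio
import threading


async def request(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for waiting on a socket
    # Report which OS thread resumed this coroutine.
    return threading.get_ident()


async def main() -> list:
    return await asyncio.gather(*(request(i) for i in range(100)))


thread_ids = asyncio.run(main())
print(f"{len(thread_ids)} requests ran on {len(set(thread_ids))} thread(s)")
```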

Can anyone recommend a good book/primer on "concurrency models" (is that a term?) for a self-taught programmer?

While I am self-taught, I'm used to (academic) books that strive for completeness. It is also what I prefer. Rather than something more pragmatic like a blog post.

It doesn't mean I want to read overly complicated prose on the subject, which I'm sure is possible.

I don't have a book recommendation ready, but if you want to see how async can be used in a large codebase, have a look at the telethon [1] library. It's a python library for telegram and one of the few that actually implement MTProto. It's huge, generates a large chunk of its machinery automatically from the MTProto specs and is extremely (!) well structured.

This is much more useful than the typical "let's write a single-run example with async" blog post.

[1] https://github.com/LonamiWebs/Telethon

The book https://pragprog.com/titles/pb7con/seven-concurrency-models-... is actually pretty good, and much better than the title may suggest.

I think you should start by reading how "async" really works… that's a call to a poll() (or epoll on Linux) function, a loop, and a list of "call this function when this file descriptor can be written/read".

The whole async thing is there to abstract away and not have the program structured around the main loop… but in reality you have to keep in mind you are in a main loop that calls poll() and then all the registered functions.
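
A bare-bones sketch of that main loop, using the standard `selectors` module (which picks epoll, poll, etc. for you) and a local socket pair so it runs without a network. The `on_readable` callback name is mine:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []


def on_readable(sock: socket.socket) -> None:
    # Called by the loop once the socket has data waiting.
    received.append(sock.recv(1024))
    sel.unregister(sock)


left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

# "Call this function when this file descriptor can be read."
sel.register(right, selectors.EVENT_READ, on_readable)
left.send(b"ping")

# The main loop: wait for ready descriptors, then run their callbacks.
while sel.get_map():
    for key, _events in sel.select(timeout=1):
        callback = key.data
        callback(key.fileobj)

left.close()
right.close()
sel.close()
print(received)  # [b'ping']
```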

Async is overengineered and bolted on. If you must use Python, I'd still recommend Twisted, which is more accessible. Otherwise, of course use Go, Elixir, etc. in the first place.

I find Twisted less accessible and more opaque than async. It's also more "bolted on" in that it's an entirely separate library/framework outside the standard lib.

Async might technically be bolted on, but no worse than async in most languages which weren't designed de novo for async (eg go/elixir).

Promising contenders include algebraic effects in OCaml and JVM's Project Loom.

It reminds me of the cooperative multitasking used by the original Mac OS.

I don't know. It seems to me Elixir requires a new and rather restrictive programming paradigm. For me, that's even worse than being overengineered.

> new

Erlang has been around since what, the 80s? Elixir is "just" Erlang with a different face and extra features.

> restrictive

which is? Functional programming? Immutability?

Interestingly, Erlang is often called a "true" object-oriented language thanks to its actor model. It's incredibly powerful and flexible, pretty much the opposite of restrictive. Just for a simple example, you can inspect, debug and modify your program while it's running.

From your comment it just seems you're not familiar with it.

I am really interested in this space.

There's an article that Cal Paterson wrote that async doesn't speed up code - it is not parallel. The GIL prevents Python from being parallel. So even if you create a thread to run an async method in Python, it shall not run in parallel to the main thread of execution. (In fact, it shall block the main thread of execution if you start a thread in the thread you are in, due to the blocking run_in_executor)

https://calpaterson.com/async-python-is-not-faster.html

I wrote a multithreaded userspace 1:M:N scheduler (1 scheduler thread, M kernel threads and N lightweight/green threads) which resembles Golang M:N model. I implemented the same design in Rust, C and Java. I am thinking it could be combined with my epoll-server and it would be an application server.

https://github.com/samsquire/preemptible-thread https://github.com/samsquire/epoll-server

I am also interested in structured concurrency. This article by Vale developers is good.

https://verdagon.dev/blog/seamless-fearless-structured-concu...

I am trying to find a concurrent software design that is scalable and is easy to write and hides complicated lock programming. I document my studies and ideas in the open in ideas4.

https://github.com/samsquire/ideas4

I've implemented multithreaded parallel multiversion concurrency control in Java, which is the same approach used by PostgreSQL and MySQL for concurrent reading and writing to the same data atomically.

I still think concurrency is hard to write and understand. Even with async/await.

  // 3 requests in flight
  result1 = async_task1();
  result2 = async_task2();
  result3 = async_task3();
  await result1;
  await result2;
  await result3;

I ported a parallel multiconsumer multiproducer ringbuffer from Alek

https://www.linuxjournal.com/content/lock-free-multi-produce...

I use Python threads in https://github.com/samsquire/devops-schedule and https://github.com/samsquire/parallel-workers to parallelise a topologically sorted graph of IO of devops programs. This allows efficient scheduling and blocking with thread.join() for each split of the work graph and then a regrouping before doing other things, also potentially in parallel. This pattern is efficient and easy to use.
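
A generic sketch of that pattern (not the code from those repositories): the standard `graphlib.TopologicalSorter` hands out every step whose prerequisites are done, each such wave runs on its own threads, and `join()` regroups before the next wave. The step names and the `run_step` function are made up:

```python
import threading
from graphlib import TopologicalSorter

# Hypothetical work graph: step -> set of steps it depends on.
graph = {
    "checkout": set(),
    "build": {"checkout"},
    "test": {"checkout"},
    "deploy": {"build", "test"},
}

finished = []
lock = threading.Lock()


def run_step(name: str) -> None:
    # Stand-in for an I/O-heavy devops command.
    with lock:
        finished.append(name)


sorter = TopologicalSorter(graph)
sorter.prepare()
waves = []
while sorter.is_active():
    ready = list(sorter.get_ready())  # steps with no unmet dependencies
    threads = [threading.Thread(target=run_step, args=(n,)) for n in ready]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # regroup before moving on to the next wave
    sorter.done(*ready)
    waves.append(sorted(ready))

print(waves)  # [['checkout'], ['build', 'test'], ['deploy']]
```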

> await result1;

> await result2;

> await result3;

Not really, you only have *1* request in flight.

And you're waiting for them sequentially.

You need asyncio.gather ( https://docs.python.org/3/library/asyncio-task.html#asyncio.... ) if you want to run tasks concurrently.

results = await asyncio.gather(result1, result2, result3)
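
To make the difference visible without relying on timings, here is a sketch that logs when each (made-up) `request` coroutine starts and ends, once with sequential awaits and once with gather:

```python
import asyncio


async def request(name: str, log: list) -> str:
    log.append(f"start {name}")
    await asyncio.sleep(0.01)  # stand-in for the actual request
    log.append(f"end {name}")
    return name


async def sequential() -> list:
    log = []
    r1, r2, r3 = request("1", log), request("2", log), request("3", log)
    # Each coroutine only starts running when it is awaited.
    await r1
    await r2
    await r3
    return log


async def concurrent() -> list:
    log = []
    # gather wraps each coroutine in a task, so all three start
    # before any of them finishes.
    await asyncio.gather(
        request("1", log), request("2", log), request("3", log)
    )
    return log


seq_log = asyncio.run(sequential())
con_log = asyncio.run(concurrent())
print(seq_log)
print(con_log)
```

In the sequential log every "end" comes before the next "start". In the gather log all three "start" entries come first.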

Before the first await result1 the coroutine objects are in flight.

Nope, creating the coroutines doesn’t schedule them for execution. That only happens on await. Python is not eager. If you want that behavior you need to use create_task. It doesn’t work like spawning a thread and waiting on them.

From the docs: https://docs.python.org/3/library/asyncio-task.html

> Note that simply calling a coroutine will not schedule it to be executed:
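
A small sketch of what that means in practice, with a made-up `job` coroutine. The coroutine object just sits there while the loop gets a turn, whereas the task created with `create_task` runs as soon as the loop is given a chance:

```python
import asyncio


async def job(name: str, log: list) -> None:
    log.append(f"{name} running")


async def main():
    log = []

    coro = job("coroutine", log)  # creates the object, schedules nothing
    await asyncio.sleep(0)        # give the event loop a turn
    before_await = list(log)      # still empty

    task = asyncio.create_task(job("task", log))  # scheduled right away
    await asyncio.sleep(0)        # give the event loop a turn
    after_create_task = list(log)

    await coro                    # only now does the coroutine body run
    await task
    return before_await, after_create_task, log


before_await, after_create_task, final_log = asyncio.run(main())
print(before_await, after_create_task, final_log)
```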

Oh this is not what I expected.

I think in C# you can await threads which is similar to a join() with a return value.

  // 3 requests in flight
  result1 = async_task1();
  result2 = async_task2();
  result3 = async_task3();

Depends on implementation, some are eager, some are lazy.