Hacker News
by anilakar 2868 days ago
Yet another asyncio tutorial that shows you how to run a few sleep tasks concurrently. Can we finally get one that shows how to do real stuff such as socket programming, wrapping non-async-compatible libraries and separating cpu-intensive blocking tasks to awaitable threads?
5 comments

> such as socket programming

That's one of my biggest pet peeves (and if you see my other comments, you'll notice I have quite a few).

To do socket programming in asyncio, you can either use:

- protocols, with a nice reusable API and an interface that clearly tells you where to do what. But you can't use "await". You are back to creating futures and attaching callbacks like 10 years ago.

- streams, where you can use async / await, but you get to write the entire life cycle by yourself all over again.

I get that protocols are faster and match the Twisted model, and I get that streams are pure and functional, but none of this is easy. I use Python to make my life easier. If I wanted extreme perfs I'd use C. If I wanted extreme pureness I'd use Haskell.
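To make the two styles concrete, here is a minimal sketch of the same echo server written both ways. The names EchoProtocol, handle_echo, round_trip and demo are made up for this example, and it binds to a free local port so it can be run as-is. Treat it as an illustration of the trade-off described above, not as a recommended design.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    """Protocol style: callbacks only, no await available in these methods."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data)


async def handle_echo(reader, writer):
    """Stream style: await works, but the whole life cycle is written by hand."""
    try:
        while True:
            line = await reader.readline()
            if not line:  # empty bytes means the client closed the connection
                break
            writer.write(line)
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def round_trip(server, message):
    """Send one line to `server` over a stream client and return the reply."""
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(message)
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply


async def demo():
    loop = asyncio.get_running_loop()
    # Port 0 asks the OS for any free port.
    proto_server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    stream_server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    return (
        await round_trip(proto_server, b"ping\n"),
        await round_trip(stream_server, b"ping\n"),
    )
```

Running asyncio.run(demo()) exercises both servers with the same client. The protocol version is shorter here only because echoing needs no waiting. As soon as a handler has to wait on something else, the callback style is where the futures come back in.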

> wrapping non-async-compatible libraries and separating cpu-intensive blocking tasks to awaitable threads

That's one of the things asyncio did right. Executors are incredibly simple to use, robust and well integrated.

Problem is: they are badly documented and the API is awkward.

I won't write a tutorial in HN, but as a starting point:

You can use:

    # inside a coroutine:
    loop = asyncio.get_event_loop()
    future = loop.run_in_executor(executor, callback, arg1, arg2, ...)
    await future

If you pass "None" as an executor, it will get the default one, which will run your callback in a thread pool. Very useful for stuff like database calls.

But if you want to run a CPU-intensive task, you need to create an instance of ProcessPoolExecutor and pass it to run_in_executor().

I say it's one of the things asyncio did right because the pools not only automatically distribute the callbacks among their workers (whose number you can control), but you also get a future back which you can await transparently.
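A minimal sketch of both cases, assuming Python 3.7+. The functions blocking_io and cpu_bound are stand-ins invented for this example, and the worker count of 2 is arbitrary.

```python
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor


def blocking_io(delay):
    """Stand-in for a blocking call such as a database query."""
    time.sleep(delay)
    return f"slept {delay}s"


def cpu_bound(n):
    """Stand-in for CPU-heavy work."""
    return sum(i * i for i in range(n))


async def in_default_pool():
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor, which is a thread pool.
    return await loop.run_in_executor(None, blocking_io, 0.01)


async def in_process_pool():
    loop = asyncio.get_running_loop()
    # max_workers controls how many worker processes the pool starts.
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [loop.run_in_executor(pool, cpu_bound, n) for n in (10, 100)]
        # Each call returns an awaitable future; gather collects them in order.
        return await asyncio.gather(*futures)
```

When using the process pool from a script, call asyncio.run(in_process_pool()) under an `if __name__ == "__main__":` guard, since worker processes may re-import the module on platforms that spawn rather than fork.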

> Problem is: they are badly documented and the API is awkward.

That's my main problem with asyncio right now: when I bump into a problem, finding the fix in the documentation is rather difficult. Right now it reads like the documentation of an old, unsupported library.

Also it's awkward that you cannot pass kwargs to the functions that you hand to asyncio, like the callback in run_in_executor. You have to wrap the function in a functools.partial that binds all the kwargs and then send that to the executor.
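For anyone who hasn't hit this yet, a minimal sketch of the workaround. fetch_rows is a hypothetical blocking function made up for the example.

```python
import asyncio
import functools


def fetch_rows(table, limit=10, order_by="id"):
    """Hypothetical blocking function that takes keyword arguments."""
    return f"SELECT * FROM {table} ORDER BY {order_by} LIMIT {limit}"


async def main():
    loop = asyncio.get_running_loop()
    # run_in_executor() only forwards positional arguments, so bind the
    # keyword arguments first with functools.partial.
    call = functools.partial(fetch_rows, "users", limit=5, order_by="name")
    return await loop.run_in_executor(None, call)
```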

I'm curious about your take on Trio or Curio. Does either address the peeves you've outlined?

https://github.com/python-trio/trio https://github.com/dabeaz/curio

Trio is a better Curio, so you don't really need Curio anymore. It was what started everything and deserves credit for that though.

As for Trio, it's what asyncio should have been from the beginning, at least for the high level part (although not for the pet peeves of socket programming: it's too low level for Python IMO)

The problem with Trio is that it's incompatible with asyncio (minus some verbose, quirky bridges), so you get yet another island, yet another ecosystem. So what, now we get twisted, tornado, gevent, qt, asyncio... and trio?

The madness must stop.

And that's why I think there is a better way: creating a higher level API for asyncio, which enforces the best practices and makes the common things easy, and the hard things decent.

A complete rewrite like Trio would be better (e.g: it famously handles Ctrl + C way better and has a smaller API). But this ship has sailed. We have asyncio now.

asyncio is pretty good honestly. But it needs some love.

So, considering asyncio is what we have to work with and, in my experience, it's quite great if you know what you are doing, I advise people to actually write a wrapper around it.

If you don't feel like writing a wrapper, I'll plug in the one I'm working on in case people are curious: https://github.com/Tygs/ayo

It:

- is based on asyncio. Not a new framework. Integrated transparently with normal asyncio.

- implements some lessons learned from trio (e.g: nurseries, cancellation, etc)

- exposes a sweet API (running blocking code is run.aside(callback, *args, **kwargs)) and automates what it can (@ayo.run_as_main sets up the event loop and runs the function in one line)

- makes hard things decent: timeout and concurrency limit are just a param away
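For comparison, here is roughly what a timeout plus a concurrency limit takes in plain asyncio. This is not ayo's API. The helper limited and the job work are names invented for this sketch, and returning None on timeout is just one possible convention.

```python
import asyncio


async def limited(sem, coro_fn, *args, timeout):
    """Run coro_fn(*args) under a semaphore, giving up after `timeout` seconds."""
    async with sem:
        try:
            return await asyncio.wait_for(coro_fn(*args), timeout)
        except asyncio.TimeoutError:
            return None  # this sketch reports a timeout as None


async def work(delay):
    await asyncio.sleep(delay)
    return delay


async def main():
    sem = asyncio.Semaphore(2)  # at most two jobs run at once
    jobs = [limited(sem, work, d, timeout=0.5) for d in (0.01, 5, 0.02)]
    return await asyncio.gather(*jobs)
```

The second job is cancelled by wait_for once its timeout expires, so the whole batch finishes in about half a second instead of five.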

It does need some serious doc, including a rich tutorial which features a pure asyncio part. It also needs some mix between streams and protocols. I'm not going to skip that part, I think it's mandatory, but I'll need many months to make the whole thing.

Now, I am not Nathaniel or Yury, so my work is not nearly as bullet proof as theirs. I would not advise installing ayo in prod now, but I think it's a great proof of concept of how good asyncio can be.

And we most certainly can do even better.

> If I wanted extreme perfs I'd use C. If I wanted extreme pureness I'd use Haskell.

Haskell concurrent socketry is decent:

https://kyle.marek-spartz.org/posts/2014-08-26-concurrent-im...

It's hilarious when a single comment on HN opens up asyncio more than the tutorial being discussed.

Well, honestly, I think most HN readers go to the comments before the actual article.

I know I do.

The whole value of this website is that we've got 1000s of experts in their fields, ready to give you their insight.

HN isn’t what it was a few years ago, but it’s still a hell of a lot better than Hackernoon.

Have you considered contributing improvements to the documentation, or even the API? Python is an open source project.

Yes. If you are not part of the club, it takes approximately 18 months from a post on python-ideas to an implementation, after so much debate it's maddening. And most of the time it gets rejected.

The process of contributing to Python is more frustrating than writing for Wikipedia.

Much easier to publish something on PyPI, then come back to python-ideas once it gets popular.

Well, with doc updates I think you'll probably find the process much smoother, and it's a good way of getting trust among the "club".

That's fair. I'll get in touch with stinner at the next PyCon, I think he is the guy for that.

If anyone wants to see some small, practical asyncio code in action, here's a little LMTP daemon I wrote recently:

https://git.sr.ht/~sircmpwn/lists.sr.ht/tree/lists-srht-lmtp via https://git.sr.ht/~sircmpwn/lists.sr.ht

Or getting deeper, another project which implements Synapse's[0] RPC protocol and encapsulates high-level RPC actions in asyncio sugar:

https://git.sr.ht/~sircmpwn/broca/tree/broca/connection.py

Code which uses this code:

https://git.sr.ht/~sircmpwn/broca/tree/broca/rpc.py

[0] https://synapse-bt.org

This architecture is made to match software threads to (logical) hardware threads, then have them loop over data separated into chunks that don't depend on each other.

If a function is blocking, needs the CPU and isn't thread safe, it can be wrapped in a message passing node that will get skipped over if a thread is already running it.

Every separate chunk of data that it creates will be dealt with concurrently and the high level structure can be put together in a graph that uses OpenGL.

https://github.com/LiveAsynchronousVisualizedArchitecture/la...

It is not in the style of a blog, but I put this together a few years ago to teach people how async programming works. It includes a handful of examples on generators, coroutines, how to make use of the socket lib's non-blocking functionality, and how to tie them together to get the yield syntax folks are used to.

It doesn’t use any asyncio parts of Python but is just meant to show what’s happening under the hood.

https://github.com/ltavag/async_presentation?files=1
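As a rough idea of the "under the hood" part, here is a tiny round-robin scheduler built on plain generators. It is not taken from the linked repository. countdown and run_all are names invented for this sketch, and it leaves out the socket side entirely (a fuller version would have tasks yield the socket they are waiting on and use select to decide which one to resume).

```python
from collections import deque


def countdown(name, n, log):
    """A task that yields control back to the scheduler after each step."""
    while n > 0:
        log.append(f"{name}:{n}")
        yield
        n -= 1


def run_all(tasks):
    """Round-robin scheduler: resume each generator until all are exhausted."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)  # not finished, put it back in line


log = []
run_all([countdown("a", 2, log), countdown("b", 3, log)])
```

After the run, log holds the steps of both tasks interleaved, which is the whole trick: each yield is a point where another task gets a turn.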

...including error handling in async worker loops, please.
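One possible shape for that, as a minimal sketch: a queue-fed worker that records a failing job and keeps going, while still letting cancellation through so shutdown works. The names worker, ok, boom and main are invented here, and collecting errors in a list stands in for whatever logging or retry policy you actually want.

```python
import asyncio


async def worker(queue, results, errors):
    """Process jobs forever; a failing job must not kill the loop."""
    while True:
        job = await queue.get()
        try:
            results.append(await job())
        except asyncio.CancelledError:
            raise  # let cancellation through so shutdown works
        except Exception as exc:
            errors.append(exc)  # record (or log) the failure and keep going
        finally:
            queue.task_done()


async def ok():
    return "ok"


async def boom():
    raise ValueError("boom")


async def main():
    queue, results, errors = asyncio.Queue(), [], []
    for job in (ok, boom, ok):
        queue.put_nowait(job)
    workers = [
        asyncio.create_task(worker(queue, results, errors)) for _ in range(2)
    ]
    await queue.join()  # wait until every job has been marked done
    for w in workers:
        w.cancel()
    # Collect the cancelled workers so nothing is left pending.
    await asyncio.gather(*workers, return_exceptions=True)
    return results, errors
```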