by zzzeek 3336 days ago
> we make heavy use of asyncio because it’s more performant

more performant than....what exactly? If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms? Answer: no. async only gives you throughput, it has nothing to do with "faster" as far as the Python interpreter / GIL / anything like that. If you aren't actually spanning among dozens/hundreds/thousands of network connections, non-blocking IO isn't buying you much at all over using blocking IO with threads, and of course async / greenlets / threads are not a prerequisite for non-blocking IO in any case (only select() is).
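
To make the last point concrete, here is a minimal sketch of non-blocking IO driven by select() alone, with no threads, greenlets, or async/await involved. It uses a local socketpair() as a stand-in for a real network connection, and the helper name `poll_once` is made up for this example.

```python
import select
import socket


def poll_once(timeout=1.0):
    """Send on one end of a socket pair, then read the other end only
    after select() reports it readable. No threads or coroutines."""
    reader, writer = socket.socketpair()  # stand-in for a real connection
    try:
        reader.setblocking(False)
        writer.setblocking(False)
        writer.send(b"ping")
        # select() waits until a socket is ready or the timeout expires.
        readable, _, _ = select.select([reader], [], [], timeout)
        return reader.recv(1024) if reader in readable else b""
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(poll_once())
```

With one socket this buys nothing; the pattern only starts to pay off when the list passed to select() holds many connections.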

it's nice that uvloop seems to be working on removing the terrible performance latency that out-of-the-box asyncio adds, so that's a reason that asyncio can really be viable as a means of gaining throughput without adding lots of latency you wouldn't get with gevent. But I can do without the enforced async boilerplate. Thanks javascript!


I'm glad you said this. There's an async cargo cult going on, where every service must be written in "performant" async code, without knowing the actual resource and load requirements of an application.

From the last benchmark I ran [1] async IO was insignificantly faster than thread-per-connection blocking IO in terms of latency, and marginally faster only after we hit a large number of clients.

Async IO doesn't necessarily make your code faster, it just makes it difficult to read.

[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...

A ~20% improvement in throughput and latency while using 50% less memory (which could allow more workers per-box) is not a "marginal" improvement in my book.
    const users = await getUsers();
    const tweets = await getTweets(users);
    console.log(tweets);

Is async code really harder to read?

Javascript's async feels a bit more natural than Python's.

In Python, you've also got to run the event loop and pass the async function to it. This makes playing with async code in the interpreter more difficult. Also don't forget that async is also turtles all the way up (same as in JS). It'll infect any synchronous code that touches it.
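
For anyone who hasn't tried it, the extra step looks roughly like this. `fetch_greeting` is a made-up coroutine, and `asyncio.run()` is the Python 3.7+ shorthand (newer than this thread); the older spelling was `loop.run_until_complete()`.

```python
import asyncio


async def fetch_greeting():
    # Hypothetical coroutine; the sleep stands in for real network IO.
    await asyncio.sleep(0.01)
    return "hello"


def run_sync():
    # Calling fetch_greeting() only creates a coroutine object; nothing
    # runs until an event loop drives it.
    return asyncio.run(fetch_greeting())


if __name__ == "__main__":
    print(run_sync())
```

Any synchronous caller that wants the result has to go through a wrapper like `run_sync` or become async itself, which is the "infection" being described.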

I've written a Tornado app which makes heavy use of asyncio, and while it's pretty efficient, I would reconsider writing it the same way if I had to go back in time.

It's not bad anymore with async/await and promises/futures, but that featureset is still bleeding-edge in most languages. Older-style async code was much more annoying.
In your example the async code doesn't really help anything though - the next statement has to wait for the response from the previous one before continuing.

In your example you'd probably want to be using Promise.all to run two IO operations simultaneously.
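
Since most of the thread is about Python, here is a sketch of the same idea with `asyncio.gather`, the closest analogue to `Promise.all`. `get_users` and `get_trending` are placeholder coroutines, and unlike the snippet above they are assumed to be independent of each other.

```python
import asyncio
import time


async def get_users():
    await asyncio.sleep(0.1)  # placeholder for a network call
    return ["alice", "bob"]


async def get_trending():
    await asyncio.sleep(0.1)  # placeholder for a second, independent call
    return ["#asyncio"]


async def load_page():
    start = time.monotonic()
    # Both coroutines are started together; gather returns the results
    # in argument order once both have finished.
    users, trending = await asyncio.gather(get_users(), get_trending())
    elapsed = time.monotonic() - start
    return users, trending, elapsed


if __name__ == "__main__":
    users, trending, elapsed = asyncio.run(load_page())
    # elapsed should be close to 0.1s, not the 0.2s of two sequential awaits
    print(users, trending, f"{elapsed:.2f}s")
```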

The next statement has to wait, but the runtime can yield to another waiting async task so you aren't blocking the total throughput of your program (assuming it's async-all-the-way-down).
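
A small sketch of that yielding, assuming two hypothetical request handlers on one event loop. Each handler is sequential inside, but while one is suspended at its `await` the loop runs the other.

```python
import asyncio


async def handle_request(name, delay, log):
    log.append(f"{name}: start")
    # While this handler is suspended here, the event loop is free to
    # run any other task that is ready.
    await asyncio.sleep(delay)
    log.append(f"{name}: done")


async def serve_two():
    log = []
    await asyncio.gather(
        handle_request("slow", 0.05, log),
        handle_request("fast", 0.01, log),
    )
    return log


if __name__ == "__main__":
    for line in asyncio.run(serve_two()):
        print(line)
```

The "fast" handler starts and finishes while "slow" is still waiting, even though everything runs on a single thread.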

The benefits are generally larger-scale than a single method.

That's hardly applicable async code. You're awaiting the actual async operations, which originally have to be distributed asynchronously from the main thread for these async operations to execute, and at that point it's the same speed as just doing sync operations inside of an async operation.

Actual asynchronicity, usually with event-based systems, gets very ugly, very fast, because you end up having to make callback chains and queue up your async work. There can be a good benefit to doing it, but it's going to be a lot less readable than most sync code, and sometimes not any faster, as in the case of Node.js and its community forcing the use of async functions in places where they don't need to be used.

That code probably represents one function in an event-loop webserver processing more than one request at a time. Non-blocking behavior is important for work involving UIs.
Throw an exception and look at the stacktrace.
This looks pretty readable to me

https://repl.it/H547/2

Sorcery. Why don't my JS stacktraces look nice? :(
Depends on your dev environment. Almost all of the browser dev tools should catch up eventually. The fun of an ecosystem with multiple competing implementations.
Heh. Somebody will assemble a few of these pieces, add a package manager for async oriented libs, call it node.py, and then market it a bit.

Then you'll really be irritated.

That's... actually not a bad idea. ᕕ( ᐛ )ᕗ
I know it will have an ORM called NodeAlchemy

/calls lawyers

Well, it can also make things faster. In your example it won't, but consider that you need to load 4 requests and do operations on each of them. if you schedule them in an async fashion you can begin operating on the first one that's ready and not the first one you defined. This is often the case: a website does not do just one request to the database. Mostly it runs several, and often they don't interfere with each other. Take getting 20 rows plus the total count: there is no need to start the first query, wait until you have the 20 rows, and only then start the second. You should start both and wait until you have both.

Yes, it does not magically make fetching 100 rows faster, or your pbkdf2()/bcrypt() call. You still need to wait for those.
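
A sketch of "operate on the first one that's ready", using `asyncio.as_completed`. The `query` coroutine and its delays are invented stand-ins for the rows query and the count query; real database drivers will look different.

```python
import asyncio


async def query(name, delay):
    await asyncio.sleep(delay)  # placeholder for a database round trip
    return name


async def first_ready_order():
    # Defined in one order, finished in another.
    pending = [query("rows", 0.05), query("count", 0.01)]
    finished = []
    for next_done in asyncio.as_completed(pending):
        finished.append(await next_done)  # handle each result as it arrives
    return finished


if __name__ == "__main__":
    print(asyncio.run(first_ready_order()))
```

This only helps with waiting. A CPU-bound step such as a password hash would still hold the event loop for its full duration.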

> if you schedule them in an async fashion you can begin operating on the first one that's ready and not the first one you defined.

This type of operation is a given in any production quality webserver, whether it runs with multiple threads and blocking IO or using a non-blocking approach with greenlets. For a web application, this is an implementation detail that should not be explicit within the request handling code (a request handled in the context of a web container after all is a package of data in, a package of data out. no network reading/writing is usually exposed to the web application unless it's trying to expose IO handles to the app, which is unusual). Easy enough with something like Gunicorn.

I think you're talking about different things; the idea is not only that you can multiplex the requests coming in, but also the requests going out to the database etc. from each web request handling function.
So on that topic, a request typically has a single transaction going out to the database, so within the scope of the request it has to perform its steps serially in any case. If it needs to make several requests to web services that aren't dependent on each other, that's an area where you can get into stacking them with some kind of concurrency construct (I'd pass it into a greenlet-oriented worker pool). But this is already going to be a heavy web request with multiple web service calls.
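
A rough sketch of that fan-out. It swaps the greenlet pool for the standard library's `ThreadPoolExecutor` so the example has no dependencies; a gevent pool would have the same shape. `call_service` and the service names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def call_service(name):
    time.sleep(0.05)  # placeholder for a blocking HTTP call
    return f"{name}: ok"


def handle_request(service_names):
    # The database work for the request would still run serially on its
    # one transaction; only the independent service calls are fanned out.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(call_service, service_names))


if __name__ == "__main__":
    print(handle_request(["auth", "billing", "search"]))
```

The request handler itself stays plain blocking code; the concurrency is confined to one construct.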
> a request typically has a single transaction going out to the database

It's typical because people are still in a "single thread single transaction ORM crud" model of thinking. "It's linear because that's how it is"?

if you're using an ACID kind of database then yes, that's how it is :)
> this is already going to be a heavy web request with multiple web service calls

Sure, but it can now be a less heavy web request! ¯\_(ツ)_/¯

> a request typically has a single transaction going out to the database

The fact of the matter is, as applications develop, become richer, and grow larger, it becomes less and less uncommon to have more than one query per page. Especially in the context of larger organizations, it's very common to have everything wrapped behind a service call with an entire armada of infrastructure hidden behind it, and having to make many service calls to put together one web API result or page.

---

sigh Slight tangent. Look at where we are now and how we came here.

Back in the non-ajax days we used to do them all on the server side, then render the whole page all in one go. This would have come in handy back then! Imagine doing 5x 50ms queries asynchronously, dropping a 250ms response delay down to 50ms! But this stuff was hard back then, and we mostly left it alone.

This is also along the times when we figured out that since we can have pages that take a long time to load and block the interpreter, perhaps it's not such a great idea to serve many requests with a single interpreter, so people started using stuff like nginx to run multiple python interpreters in parallel (not even getting into threads here), which was easier to reason about since each python process is a separate universe that can block entirely, but overall we can still serve a new request with a new interpreter, so for the most part things are good.

Then the twisted people thought that this was silly, and why should we block in the first place, and they decided that the way to fix this was to change the way we program entirely, and re-create or wrap an entire ecosystem of software. It sort of worked, except there wasn't a good twisted package for your thing. But all in all it worked.

Then the greenlets (or one of its other 20 names) people came and wanted to instead use fine-grained implicit concurrency, and said "no no, we can get something with nicer abstraction packaging while mostly not changing the code we have", and that was even nicer, except when something didn't get monkey patched correctly for some reason. We got stuff like gunicorn, which was impressive.

Then, as we moved more stuff to the client to create more responsive (in the original meaning of the word) applications, we pushed the burden of requesting and fetching data to the browser side, which means that as a page loads, it might call REST APIs one by one (hopefully asynchronously!), each of which might make a single (finer-grained) database or service call behind the scenes.

So how different is this now from the gunicorn model? In the latter, you get fine threads of control, each working asynchronously to fetch their own thing, which gets put together in the server side, and then sent back to the client. In the former, you get similarly fine threads of control, but the fine threads perhaps live in their own universes, and it doesn't get all put back together until it travels over the internet to the browser.

So it's a little bit different, but overall what's happening is similar. It feels like we just keep moving concerns and procedures up and down the stack.

Surely there are reasons for all this. Times and technologies change, and we find ways to adapt. I like the "async" stuff because it makes things explicit. It's the middle-ground result of the culmination of our learnings that hiding async behavior makes libraries hard to design and can result in frustrating and unpredictable behavior, whilst changing the entire programming model isn't great either. So we get asyncio. I'm mostly happy with this result. Admittedly this article isn't doing any of this justice.

> it becomes less and less uncommon to have more than one query per page.

I said transaction, not query. A database transaction is on a single connection at a time and queries are performed via the transaction serially.

> Will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms?

If you have to do 1000 queries it could, since async could make it feasible to do them in parallel. If it's a single query, maybe async would make it feasible to shard the database.

You usually see this pattern in ORMs with N+1 queries. If a single request requires 1000 db queries, it is better to optimise the queries.
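
For anyone unfamiliar with the N+1 pattern, here is a small illustration using an in-memory SQLite database. The schema and data are invented for the example; the point is only the number of round trips.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
    """)
    return db


def titles_n_plus_one(db):
    # One query for the users, then one more query per user.
    result = {}
    users = db.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    for user_id, name in users:
        rows = db.execute(
            "SELECT title FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query(db):
    # Same data in one round trip using a JOIN. Note that users with no
    # posts would be omitted by this inner JOIN.
    result = {}
    query = """
        SELECT users.name, posts.title
        FROM users JOIN posts ON posts.user_id = users.id
        ORDER BY users.id, posts.id
    """
    for name, title in db.execute(query):
        result.setdefault(name, []).append(title)
    return result


if __name__ == "__main__":
    db = make_db()
    print(titles_n_plus_one(db))
    print(titles_single_query(db))
```

Running the per-user queries concurrently would hide some of the latency, but the single query removes the round trips altogether.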
It saves you each thread's stack, which only matters if you have a stupid number of connections. In this article[1] the author compares the two models, and 7000 concurrent users chew up 450MB of stack space. Of course this is adjustable.

[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...

On most Linux systems the stack is allocated with mmap, with overcommitting. Until the first write, all those pages share the same zeroed page, AFAIK. After that, only the pages actually written to get allocated.

Am I wrong?

How do you save on stack space with asyncio? Don't you have to keep the coroutine object in memory somewhere?
I think the idea is that these "coroutine objects" (or the equivalent structure in whatever language) are smaller than the typical stack size for a thread. For example, the default stack size on Windows is 1 MB. So if you have a thread per connection, obviously this is going to take up a decent amount of memory. I'm guessing the answer to this is a thread pool so your memory usage doesn't blow up.

https://msdn.microsoft.com/en-us/library/windows/desktop/ms6...
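
One rough way to get a feel for the coroutine side of that comparison is to measure it with `tracemalloc`. This is only a sketch: `handler` is a made-up, trivial coroutine, the objects are created but never started, and the number excludes sockets, buffers, and whatever a real handler keeps in local variables. As noted above, a thread's reserved stack is also mostly untouched virtual memory, so the two figures are not directly comparable.

```python
import tracemalloc


async def handler():
    # Hypothetical per-connection coroutine; never actually run here.
    return None


def bytes_per_coroutine(count=10_000):
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    coros = [handler() for _ in range(count)]  # created but not started
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    for coro in coros:
        coro.close()  # avoid "never awaited" warnings
    return (after - before) / count


if __name__ == "__main__":
    print(f"about {bytes_per_coroutine():.0f} bytes per coroutine object")
```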

> more performant than....what exactly? If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms?

Potentially, it depends on if you can do other tasks for the same request that don't depend on the data. You might be able to render most of the page for instance. It's not purely about throughput.

Please tell me that 300ms was made up too and that it's not really taking that long.

https://magic.io/blog/uvloop-blazing-fast-python-networking/... from the makers of uvloop (for a toy example)

it seems the main bottleneck when using aiohttp is aiohttp itself, which practically makes the use of uvloop irrelevant

If you have to make several requests to the db backend to fulfil one response, then asyncio potentially allows you to make them in parallel rather than in series, reducing the latency of your response.
> If I need to load 1000 rows from a database and splash them on a webpage, will my response time go from the 300ms it takes without asyncio to something "more performant", like 50ms? Answer: no

Well, actually, yes. Without async rendering, your webpage is not ready until your list of 1000 rows has been loaded into Python memory, rendered to HTML as a whole, and then returned to your browser, after something like 300ms of server time.

With async rendering, your webpage's headers and such can be returned immediately, so your time-to-first-byte can be under 50ms, and the page loads incrementally as the remaining rows are enumerated and rendered.

Well you can do all of that sync, can't you?

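    def on_connection(send, db):
        # Blocking code; send() is assumed to write each chunk out immediately.
        send("headers")
        send("start of page")
        for row in db:  # iterate the cursor, sending rows as they arrive
            send(row)
        send("footer")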
will have the exact same effect as what you said (not that it applies regardless; I don't think Jinja outputs partial renders, since it's made for Flask)

The performance comparison is between python managed green threads, and OS managed actual threads. You don't get any new features
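
The sketch above maps fairly directly onto plain WSGI, where a synchronous app may return a generator and the server sends each chunk as it is produced. A minimal version follows; `make_app` and the row data are invented for the example, and whether a chunk actually reaches the browser early still depends on the server and any buffering proxy in front of it.

```python
def make_app(rows):
    """Build a plain synchronous WSGI app that streams its body."""

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])

        def body():
            yield b"<ul>"
            for row in rows:  # could just as well be a database cursor
                yield f"<li>{row}</li>".encode("utf-8")
            yield b"</ul>"

        return body()

    return app


if __name__ == "__main__":
    demo = make_app(["a", "b"])
    chunks = demo({}, lambda status, headers: None)
    print(b"".join(chunks))
```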

Another point is your server can switch context to handle other requests with async.

In the real world, your web page involves more than one backend query (like MySQL + Redis + some RPC calls to microservices). With async APIs, you can issue all of the queries concurrently and join them at rendering time.

The async benefits can add up to a much more responsive server.

Yes, those are threads when handled by the OS / greenthreads when handled by the program.

a program with threads can support multiple requests simultaneously. a program with green threads can support multiple requests simultaneously.

You aren't giving any reasons why green threads in Python perform better than threads in the OS.

Well, threads also switch context.
That's a client streaming optimization, not related to the subject at hand which is non-blocking network IO. Assume the service returns a JSON structure. It won't get to the end any faster.
There must exist a module like `ijson` which could incrementally generate JSON.
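
As far as I know `ijson` itself is an incremental parser rather than a generator, but for a flat array of rows the producing side can be sketched with the standard library alone. `stream_json_array` is a made-up helper, and it assumes each row is small enough to serialize in one piece.

```python
import json


def stream_json_array(rows):
    """Yield a JSON array piece by piece without building it in memory."""
    yield "["
    for index, row in enumerate(rows):
        if index:
            yield ","  # separator before every element except the first
        yield json.dumps(row)
    yield "]"


if __name__ == "__main__":
    for chunk in stream_json_array(iter([{"id": 1}, {"id": 2}])):
        print(chunk)
```

Joined together, the chunks form one valid JSON document, so a client that does not stream can still parse the result normally.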
I went down this rabbit hole once, and it turns out you /can/ do something like this, with everything streaming all the way from the database to Python to the web server to the client. The problem then was that, even after all that effort, whatever JavaScript was on the receiving end usually processed it in a non-streaming way.

Then I found this http://oboejs.com/ and it was even more work, and I gave up. In the end it required rethinking everything and battling against a whole set of tools and libraries that just didn't think that way.

You are the hero we need, Mike