|
First off, it's awesome to see more benchmarks (even if it's just personal experimentation) for synchronous vs asyncio performance. I think the real argument for asyncio right now is that it makes it very easy to write extremely efficient code, even for hobbyist projects. Even though your experiment is only handling 320 req/s, the fact that you were able to do that so quickly and with very little optimization is, I think, a testament to the potential of asyncio.

Some pointers:

The event loop still runs in a single thread and is therefore subject to the GIL. That means that at any given time, only one coroutine is running in the loop. This is important for several reasons, but probably the most relevant are that:

1. Within any given coroutine, execution flow will always be consistent between yield/await statements.
2. Synchronous calls within coroutines will block the entire event loop.
3. Most of asyncio was not written with thread safety in mind.

That second one is really important. When you're doing file access, e.g. where you're doing "with open('frank.html', 'rb')", that's something you may want to consider moving into a run_in_executor call. Awaiting that call will suspend the coroutine, but it will return control to the event loop, allowing other connections to proceed.

Also, more likely than not, the "too many open files" error is a result of you opening frank.html, not of sockets. I haven't run your code with asyncio in debug mode[1] to verify that, but that would be my intuition. You would probably handle more requests if you changed that -- I would do the file access in a run_in_executor with a max executor workers of 1000. If you want to surpass that, use a process pool instead of a thread pool, and you should be ready to go, though it's worth mentioning that disk IO is hardly ever CPU-bound, so I wouldn't expect you to get much of a performance boost otherwise.

Also, the placement of your semaphore acquisition doesn't make any sense to me.
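I would create a dedicated coroutine like this:

    async def bounded_fetch(sem, url):
        # Hold the semaphore only while this one request is in flight.
        async with sem:
            return await fetch(url)

and modify the parent function like this:

    for i in range(r):
        # Pass the formatted URL in, since bounded_fetch cannot see i.
        task = asyncio.ensure_future(bounded_fetch(sem, url.format(i)))
        tasks.append(task)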
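To make the pattern above concrete, here is a minimal self-contained sketch. It assumes a stand-in `fetch` that just sleeps instead of making an HTTP request; the names `run`, `limit`, `in_flight`, `peak` and the `example.invalid` URL are hypothetical. The counters exist only to show that no more than `limit` fetches are in flight at once.

```python
import asyncio

in_flight = 0  # number of fetches currently running
peak = 0       # highest value in_flight ever reached


async def fetch(url):
    # Stand-in for a real HTTP request (assumption: real code would use an HTTP client).
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return url


async def bounded_fetch(sem, url):
    # The semaphore is held only for the duration of one fetch.
    async with sem:
        return await fetch(url)


async def run(r, limit):
    # Create the semaphore inside the running event loop.
    sem = asyncio.Semaphore(limit)
    url = "http://example.invalid/{}"  # hypothetical URL template
    tasks = [asyncio.ensure_future(bounded_fetch(sem, url.format(i))) for i in range(r)]
    return await asyncio.gather(*tasks)


results = asyncio.run(run(50, 5))
```

All 50 tasks are created up front, but each one waits at `async with sem` until a slot is free, so `peak` should never exceed the limit passed to `run`.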
That being said, it also doesn't make any sense to me to have the semaphore in the client code, since the error is in the server code.

[1] https://docs.python.org/3/library/asyncio-dev.html#debug-mod... |
> You would probably handle more requests if you changed that -- I would do the file access in a run_in_executor with a max executor workers of 1000.
This is a really good point. I'm going to check this and edit the post to add this information.
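A minimal sketch of what moving the file read into an executor could look like. The helper names (`read_file`, `read_file_async`) and the temporary file standing in for frank.html are assumptions for illustration; the pool size of 1000 is the figure from the quoted suggestion, not a tuned value.

```python
import asyncio
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    # Ordinary blocking read; it runs in a worker thread, not on the event loop.
    with open(path, "rb") as f:
        return f.read()


async def read_file_async(executor, path):
    loop = asyncio.get_running_loop()
    # Suspends this coroutine until the read finishes; the loop keeps serving others.
    return await loop.run_in_executor(executor, read_file, path)


async def main(path):
    # max_workers=1000 is the quoted suggestion; worker threads are started only as needed.
    with ThreadPoolExecutor(max_workers=1000) as executor:
        reads = (read_file_async(executor, path) for _ in range(20))
        return await asyncio.gather(*reads)


# Temporary file standing in for frank.html (hypothetical content).
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as tmp:
    tmp.write(b"<html>frank</html>")

bodies = asyncio.run(main(tmp.name))
os.remove(tmp.name)
```

Because each file is opened and closed inside `read_file`, the number of simultaneously open file handles is bounded by the number of worker threads rather than by the number of pending requests.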
> Also, the placement of your semaphore acquisition doesn't make any sense to me. I would create a dedicated coroutine like this:
Looking at my semaphore code the day after writing it, I do wonder if I'm using it correctly. I assumed it works because it fixed my "too many open files" exception, which seems to mean that I'm no longer exceeding the 1024 open files limit. Can you clarify why you think my use of the semaphore does not make sense and why your suggestion is better? What is the benefit of a dedicated coroutine?
> That being said, it also doesn't make any sense to me to have the semaphore in the client code, since the error is in the server code.
I admit that I focused more on my client than on my server. One thing that worries me about my test server is that it does not print any exceptions. Either it does not fail at all, which seems unlikely, or it fails silently, which is more likely and is bad. So I need to check my server code to see what exactly happens there.
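One way a server can appear to fail silently, assuming the failure happens inside a task that nothing awaits: an exception raised in an asyncio task is stored on the task and only surfaces when something awaits the task or asks for its result. Whether that is what happens in this particular server is not verified. A minimal sketch, with hypothetical names `handle` and `log_failure`, of attaching a callback that records such exceptions:

```python
import asyncio

failures = []  # collected exceptions, for illustration only


async def handle(n):
    # Hypothetical handler that fails for one particular input.
    await asyncio.sleep(0)
    if n == 3:
        raise OSError(24, "Too many open files")
    return n


def log_failure(task):
    # Called by the event loop once the task has finished.
    if not task.cancelled() and task.exception() is not None:
        failures.append(task.exception())
        print("task failed:", repr(task.exception()))


async def main():
    tasks = [asyncio.ensure_future(handle(n)) for n in range(5)]
    for t in tasks:
        t.add_done_callback(log_failure)
    # Wait for completion without retrieving the results themselves.
    await asyncio.wait(tasks)


asyncio.run(main())
```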
> it also doesn't make any sense to me to have the semaphore in the client code, since the error is in the server code.
The main reason for the semaphore in the client code is that it should stop the client from making over 1k connections at a time. My logic here is that if the client won't make 1k connections at a time, the server won't receive 1k connections at a time, and thus there will be no problem of too many open files on the server (it won't have to send more than 1k responses). However, I see that this logic may not be totally correct: another comment points out that it's possible for sockets to "hang around" after closing: https://news.ycombinator.com/item?id=11557672 so I need to review that and edit the post.
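For reference, the limit discussed here is the per-process cap on open file descriptors, and both open files and sockets count against it. A minimal sketch for reading it from Python, assuming a Unix-like system (the `resource` module is not available on Windows); the helper name `open_file_limits` is hypothetical:

```python
import resource


def open_file_limits():
    # Returns the (soft, hard) limits on open file descriptors for this process.
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
```

The soft limit is the one a process actually hits; running this in both the client and the server process would show whether they share the same ceiling.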
> https://docs.python.org/3/library/asyncio-dev.html#debug-mod...
This looks really great, I will look into this, thanks.
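For reference, a minimal sketch of turning debug mode on from code, assuming Python 3.7 or later for `asyncio.run`. The `time.sleep` call is a deliberately blocking stand-in for something like a synchronous file read; in debug mode the loop logs a warning for callbacks that run longer than `slow_callback_duration`.

```python
import asyncio
import time


async def main():
    loop = asyncio.get_running_loop()
    # In debug mode, steps slower than this many seconds are logged as warnings.
    loop.slow_callback_duration = 0.05
    time.sleep(0.1)  # deliberately blocking call, to trigger the slow-callback warning
    return loop.get_debug()


debug_enabled = asyncio.run(main(), debug=True)
```

Debug mode can also be enabled without touching the code by setting the `PYTHONASYNCIODEBUG=1` environment variable.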