|
No problem, it's especially hard to find external feedback for side projects and experiments so I try to give it when I can. > I assumed it works correctly because it fixed my "too many open files" exception It works, so at the end of the day that's what matters. The client vs server question, from my perspective, ultimately comes down to a question of test realism; in a real-world deployment you couldn't limit connections with client-side code because there are multiple clients. That's what I mean by "it doesn't make sense given that the error is server-side". > Can you clarify why you think my use of semaphore does not make sense and why your suggestion is better? What is the benefit of dedicated coroutine? I'm saying that mostly, but not exclusively, from a division of concerns standpoint. You're acquiring the semaphore in a completely different context than you're releasing it. On the one hand, that's partly a programming style issue. On the other hand, it can also have some really important consequences: for example, it's actually the event loop itself that is releasing the semaphore for you when the task is done. Because of the way the event loop works, it's hard to say exactly when the semaphore will be released. You want to hold it for the absolute minimum time possible, since it's holding up execution of other connections in the loop. Putting it into a dedicated coroutine makes it clearer what's going on, makes it such that the acquirer and releaser of the semaphore are the same, and means you are definitely holding the semaphore for the minimum amount of time possible (since, again, execution flow will not leave any particular coroutine until you yield/await another). In general I would say that releasing the semaphore in a callback is significantly more fragile, and mildly to moderately less performant, than creating a dedicated coroutine to hold the semaphore and handle the request. Does that all make sense? > Either it does not fail at all, which seems unlikely, or it fails silently, which is more likely and is bad. That's a fair statement, I think. As an aside, the print statement is slow, so keep that in mind. It might actually be faster to have a single memory-mapped file for the whole thing, and then just append the error and traceback to the file. The built-in traceback library can be very useful for that. That's also a bit more realistic, since obviously IRL you wouldn't be using a print statement to keep track of errors. On a similar note, because file access is so slow, you'd be best off figuring out some way to remove the part where the server accesses the disk once per connection entirely. On a real-world system you'd possibly use some kind of memory caching system to do that, especially if you're just reading files and not writing them. That allows you to use a little more memory (potentially as little as enough to have a single copy of the file in memory) to drastically improve performance. |
yeah it does make sense.
> in a real-world deployment you couldn't limit connections with client-side code
yeah that's a very good point. But in a real world scenario handling this would not be that easy. Limiting number of available connections on the server side is not a trivial task to implement. Setting your server to avoid failures and simply return either 503 service unavailable to some clients or 429 (too many conns) to others would probably require quite a lot of coding. It's also not very clear to me how this would be implemented, how do people implement things like this? Just putting some check for number of open files before line that opens file and setting response code to 500 and 429 before opening file? This would only stop server from opening to many "html" files, but would not stop server from getting flooded with connections. Is my aiohttp app even the right place to add checks like this? Wouldn't it be better to use haproxy or nginx or some other load balancing service in front of aiohttp app and let it handle too much traffic?
Other thing that comes to my mind (need to check this later) is that perhaps some partial "handling" of cases like this could/should be implemented in aiohttp library. I'm not sure how it behaves now, but maybe it should simply fail to open file, return 500 to the client, and print noisy traceback about open files to my logs? I didnt see this behavior when doing my tests, so either it didn't occur, is not implemented in aiohttp, or it occurrred and I somehow missed that. From my experience with Twisted I know that this is how Twisted resources behave, if you have some unhandled exceptions twisted just returns 500 to client and show traceback in logs.