Hacker News
by zzzeek 1334 days ago
well that's the thing with threads, you shouldn't be "spinning them up on the fly", you should have a fixed pool of threads. That then involves some architectural work up front (like 5 lines of code, ugh) and that's where everyone (under age 40) yawns and goes off to use asyncio instead (which oddly enough has a worker thread running in the form of the event loop, it's just all been presented nicely).
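For reference, the "5 lines of code" could look roughly like this. It's a minimal sketch using the standard library's `concurrent.futures.ThreadPoolExecutor`; the `handle` and `run_all` names and the pool size of 4 are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def handle(item: int) -> int:
    # Hypothetical stand-in for a blocking unit of work.
    return item * 2


def run_all(items, workers: int = 4):
    # A fixed pool: at most `workers` threads exist, however many items arrive.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands items to the pool and yields results in input order.
        return list(pool.map(handle, items))
```

The pool is created once and reused, so nothing is spun up per work item.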

It sounds like you're talking about a different situation. The parent comment was talking about thread-per-connection, with a blocking IO read call on each. Yes, that means spinning up and shutting down threads as connections open and close, and that is 100% a valid strategy. If you have a fixed pool of n threads and you get n+1 connections, then you're just going to have to ignore one at any given time (potentially causing deadlock, depending on the relationship between the connections) or end up using a multiplexing API, at which point you're not far off from async world anyway.
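To make the thread-per-connection shape concrete, here's a rough sketch. It uses `socket.socketpair()` in place of a real listening socket so it runs standalone; `echo_until_closed`, `serve_connection` and `demo` are hypothetical names, and the upper-casing is just a visible stand-in for real work.

```python
import socket
import threading


def echo_until_closed(conn: socket.socket) -> None:
    # The whole life of one connection: block in recv() until the peer closes.
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data.upper())


def serve_connection(conn: socket.socket) -> threading.Thread:
    # Spin up a dedicated thread when a connection opens; it exits on close.
    worker = threading.Thread(target=echo_until_closed, args=(conn,), daemon=True)
    worker.start()
    return worker


def demo() -> bytes:
    # socketpair() stands in for accept() on a listening socket.
    client, server_side = socket.socketpair()
    worker = serve_connection(server_side)
    client.sendall(b"ping")
    reply = client.recv(1024)
    client.close()
    worker.join(timeout=5)
    return reply
```

Each thread only ever thinks about its own connection, which is the appeal; the cost is one thread per open connection.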

Maybe you're talking about just submitting independent work items to run concurrently – yes async won't help much with that, because you're in the most trivial situation possible.

In more complex situations, with interrelationships between tasks (/threads), async syntax and task groups definitely have a huge impact. And, as I said, that's before you even get into how much easier it makes cancellation.

yes, parent was referring to "each thread with a blocking connection", but you still can (and probably should) use a thread pool for that. In the naive approach, new connections beyond the limit of your threadpool either have to wait, or you have to dynamically expand your threadpool. mod_wsgi's daemon mode has the option to use a thread pool of a fixed size to handle requests.
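Here's a rough illustration of the "new connections have to wait" behaviour with a fixed-size pool. This is only a sketch: `measure_peak` is a hypothetical helper, the "connections" are just integers, and `time.sleep` stands in for a blocking read.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def measure_peak(connections: int = 5, pool_size: int = 2):
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(conn_id: int) -> int:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # stand-in for blocking IO on this connection
        with lock:
            active -= 1
        return conn_id

    # Work beyond pool_size queues inside the executor until a thread frees up.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        served = list(pool.map(handle, range(connections)))
    return served, peak
```

All five get served, but never more than `pool_size` at once. That is fine when handlers finish quickly, and it is exactly the case that goes wrong if a running handler is waiting on a connection that is still queued.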

you can also use non-blocking handles with a fixed/dynamic threadpool and use epoll or similar to find those handles with data ready, and send those into your pool, thereby servicing an arbitrary number of connections with a controlled level of concurrency among them. MariaDB has an option to do that here: https://mariadb.com/kb/en/thread-pool-in-mariadb/. this is not as trivial as spinning up asyncio tasks but that's because there's (AFAIK) no friendly library giving you an easy way of doing it. But it's Python and if you're writing a server to handle MariaDB server loads using a thread pool with direct use of epoll(), you're likely in the wrong language.
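A minimal sketch of that pattern, using the standard library's `selectors` module (which picks epoll where available) rather than calling epoll() directly. `serve_once`, `read_ready` and `demo` are hypothetical names, and the sketch reads only one message per connection; a real server would re-register each socket after handling it.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def read_ready(conn: socket.socket) -> bytes:
    # Runs in a pool thread; the selector already reported data as ready.
    return conn.recv(1024)


def serve_once(conns, workers: int = 2) -> dict:
    sel = selectors.DefaultSelector()
    for conn in conns:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    results = {}
    pending = len(conns)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            events = sel.select(timeout=5)
            if not events:
                break  # nothing became ready in time; give up in this sketch
            futures = []
            for key, _mask in events:
                # Unregister so the same socket isn't reported twice.
                sel.unregister(key.fileobj)
                futures.append((key.fileobj, pool.submit(read_ready, key.fileobj)))
            for conn, fut in futures:
                results[conn] = fut.result()
                pending -= 1
    sel.close()
    return results


def demo() -> list:
    # Five connections serviced by a pool of two threads.
    pairs = [socket.socketpair() for _ in range(5)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"msg{i}".encode())
    served = serve_once([server for _client, server in pairs], workers=2)
    out = sorted(served.values())
    for client, server in pairs:
        client.close()
        server.close()
    return out
```

The number of connections and the number of threads are now independent: the selector watches all the handles, and the pool only bounds how many are being worked on at once.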