by omazurov 1765 days ago
> If you're spawning one million threads, each of which will perform a blocking operation, then you're likely not CPU bound

Imagine a DAG with a million nodes. Each node takes data from all its input edges, processes it, and sends the result to all its output edges. Now that I have lightweight threads, I want to use one per node. Edges are naturally implemented as blocking queues. If I feed enough data into this structure, I get a CPU-bound workload with a lot of concurrency (and scalability). Yet the nodes will still have to block on their input queues, because the data will not be ready in most cases.
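
To make the shape concrete, here is a minimal sketch of that structure, assuming Java 21+ where `Thread.startVirtualThread` is available. The four-node diamond, the `DagSketch` class, the `END` marker, and the queue capacity of 16 are all invented for illustration; a real DAG would have far more nodes and its own shutdown protocol.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.ToLongFunction;

class DagSketch {
    // Hypothetical end-of-stream marker, forwarded along every edge.
    static final long END = Long.MIN_VALUE;

    // An edge is a small bounded blocking queue (capacity is an arbitrary choice).
    static BlockingQueue<Long> edge() {
        return new ArrayBlockingQueue<>(16);
    }

    // A node is one virtual thread: take one value from every input edge,
    // apply f, and put the result on every output edge.
    static Thread node(List<BlockingQueue<Long>> in,
                       List<BlockingQueue<Long>> out,
                       ToLongFunction<long[]> f) {
        return Thread.startVirtualThread(() -> {
            try {
                while (true) {
                    long[] args = new long[in.size()];
                    boolean end = false;
                    for (int i = 0; i < args.length; i++) {
                        args[i] = in.get(i).take();   // blocks until data is ready
                        end |= args[i] == END;
                    }
                    long result = end ? END : f.applyAsLong(args);
                    for (BlockingQueue<Long> q : out) {
                        q.put(result);                // blocks if the edge is full
                    }
                    if (end) {
                        return;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Builds split -> {square, twice} -> join, feeds 1..n, and sums the output.
    static long run(int n) {
        BlockingQueue<Long> q0 = edge(), q1 = edge(), q2 = edge(),
                            q3 = edge(), q4 = edge(), q5 = edge();
        node(List.of(q0), List.of(q1, q2), a -> a[0]);        // split
        node(List.of(q1), List.of(q3), a -> a[0] * a[0]);     // square
        node(List.of(q2), List.of(q4), a -> 2 * a[0]);        // twice
        node(List.of(q3, q4), List.of(q5), a -> a[0] + a[1]); // join

        // The source is just another virtual thread.
        Thread.startVirtualThread(() -> {
            try {
                for (long x = 1; x <= n; x++) {
                    q0.put(x);
                }
                q0.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        long sum = 0;
        try {
            for (long v = q5.take(); v != END; v = q5.take()) {
                sum += v;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(3)); // sum of x*x + 2*x for x = 1..3
    }
}
```

Every node here is CPU-only work, yet each one spends most of its life parked in `take()` or `put()`, which is exactly the mix described above.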

Now, to complicate the design, I also want I/O tasks that read data from a file or the network and feed it into the DAG, and tasks that send the DAG's output to the UI thread. Should I use different kinds of "threads" for different parts of the design? I'd hate to, especially since all those parts are dynamic in nature.

So, of course, I'm not currently spawning a million threads to execute all those tasks. But Loom says I could do that for all of them, and my concern is that to get decent performance I'd still have to explicitly separate my "CPU-bound" tasks from my "I/O-bound" tasks and use a different kind of thread for each (which I'm more or less doing right now, but again, the promise was...).

2 comments

In that case you should be fine. It should work as well as goroutines and async/await in other languages.

As long as you do I/O or other blocking operations that block for a significant amount of time (milliseconds rather than nanoseconds), Loom will be beneficial. Otherwise, you'd do better with regular threads in a pool, but even then I'd expect the performance difference to be tiny.
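
If it helps, the choice between the two can be kept behind a plain `ExecutorService`, so the task code doesn't care which kind of thread runs it. A rough sketch, assuming Java 21+; the `ExecutorChoice` class and `sumOfSquares` helper are hypothetical names, and this only shows the wiring, not a benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorChoice {
    // Runs the same tiny CPU-only tasks on whichever executor is passed in.
    // The executor is closed (and its tasks awaited) when the method returns.
    static long sumOfSquares(ExecutorService executor, int tasks) {
        try (executor) {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                final long x = i;
                futures.add(executor.submit(() -> x * x));
            }
            long sum = 0;
            for (Future<Long> f : futures) {
                sum += f.get();
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Regular platform threads in a fixed pool.
        System.out.println(sumOfSquares(Executors.newFixedThreadPool(cores), 10));
        // One virtual thread per task.
        System.out.println(sumOfSquares(Executors.newVirtualThreadPerTaskExecutor(), 10));
    }
}
```

Both calls compute the same result; any performance difference between them would have to be measured on the actual workload.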

This doesn't answer your question (indeed, what you've described is very similar to my entity simulation example), but it sounds like the actor model might be a good fit, FWIW.