| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Tichy 5413 days ago
	Except that event-driven io is actually the simpler solution compared to threads, isn't it?

2 comments

jbooth 5413 days ago

Depends. For an application where you have significant CPU work, a simple worker pool (not coded by you), with an accept loop

while (s=accept()) { dispatchToWorkerPool(s) }

And an application-programmer method handleConnection(s) is pretty darned simple. Just do blocking I/O from your threads and rely on the machine to swap in and out appropriately.

For the next level, you can have an evented i/o layer that passes sets of buffers back and forth to your worker pool. That's more complex but still spares application programmers from having to worry about either eventing or threads, they just worry about passing back the right bytes.

Here's where things get interesting, though. It turns out someone studied this (in Java) and actually found that stupidly throwing 1k threads at the problem with blocking i/o performed better than a clever nonblocking server, due to linux 2.6's threading and context-switching improvements.

http://www.mailinator.com/tymaPaulMultithreaded.pdf

Some of that might be specific to Java's integration with linux, but think about it.. the linux guys did a pretty good job at getting the right thread to wake up and assigning it to runnable state. Moving to evented i/o might be the right move for your workflow but it also might make no difference at the cost of some additional complexity.

tl;dr context switching has gotten cheaper/better as linux has improved, 16 CPUs vs 1 CPU changes the math about context switching vs using select(), and worker pool libraries mean you shouldn't actually manage threads yourself.

link

dap 5413 days ago

That thread pool isn't actually that simple. How many threads do you use? If you throw 1,000 threads at the problem with a 2MB stack each, that's 2GB of DRAM you've thrown away (instead of 20MB * ncores per Node process) -- DRAM that could be caching filesystem data, for example, which could have a huge impact on overall performance.

With Node, the DRAM and CPU used scales with the number of cores and actual workload. With a thread pool, the DRAM used scales at least with the number of concurrent requests you want to be able to handle, which is often much larger than you can handle simultaneously (because many of them will be blocked much of the time).

Assuming you're not willing to reserve all that memory up front, the algorithm for managing the pool size also has to be able to scale quickly up and down (with low latency, that is) without thrashing.

link

damageboy 5413 days ago

What a load of crap... you just wasted 2gb of address space not of DRAM... You'll "waste" exactly up to the amount of stack each thread uses, rounded up to PAGE_SIZE which is usually 4kb

Let me guess, a nodejs fan?

link

dap 5413 days ago

You're right -- it's the touched pages that count. In many cases, that's a few MB per stack, which is what I said.

link

mst 5413 days ago

Alternatively, create the socket, fork N processes and have each of said processes run an accept loop. Assuming they're CPU bound, this is going to be pretty much just as efficient and you don't have to prat about with threads -or- evented I/O.

Ain't UNIX grand?

link

j_baker 5413 days ago

It depends, but I don't think either one is inherently simpler than the other. As a general rule of thumb, I find that asynchronous solutions are generally better for I/O bound stuff while threads are better for computationally expensive stuff.

link