by ivank 4328 days ago
You need select or a select-like function to know which sockets are readable/writable without tying up one thread per socket

I know what select/epoll and friends are for.

What's wrong with tying up one thread per socket?

"Context switching is expensive. My rule of thumb is that it'll cost you about 30µs of CPU overhead. This seems to be a good worst-case approximation. Applications that create too many threads that are constantly fighting for CPU time (such as Apache's HTTPd or many Java applications) can waste considerable amounts of CPU cycles just to switch back and forth between different threads."

http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-ma...
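To put the quoted rule of thumb in perspective, here is a rough back-of-the-envelope sketch. It takes the 30µs figure from the quote at face value as a per-switch worst case; the function name and the sample switch rates are made up for illustration, not measurements.

```python
# Worst-case per-switch cost from the quoted post, in microseconds.
SWITCH_COST_US = 30


def switch_overhead_fraction(switches_per_second, cost_us=SWITCH_COST_US):
    """Fraction of one core's time spent purely on context switching."""
    return switches_per_second * cost_us / 1_000_000


if __name__ == "__main__":
    # Hypothetical switch rates, chosen only to show how the cost scales.
    for rate in (1_000, 10_000, 30_000):
        share = switch_overhead_fraction(rate)
        print(f"{rate:>6} switches/s -> {share:.0%} of one core")
```

Under that assumption, 10,000 switches per second would eat about 30% of one core, which is why the number matters mostly for servers with many busy threads.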

On 32-bit systems, you can also easily run out of address space for your thread stacks.
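A quick sketch of the arithmetic behind that. The numbers are assumptions, not from the thread: roughly 3 GiB of user address space (a common split on 32-bit Linux) and an 8 MiB default stack reservation per thread (a common `ulimit -s` default). The real limit is lower still, since the heap, code and libraries need address space too.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def max_threads(address_space_bytes, stack_bytes):
    """Upper bound on threads if stacks were the only users of address space."""
    return address_space_bytes // stack_bytes


if __name__ == "__main__":
    # Assumed values: 3 GiB usable address space, 8 MiB default stacks.
    print(max_threads(3 * GIB, 8 * MIB))
    # Shrinking the per-thread stack (assumed 64 KiB here) raises the ceiling.
    print(max_threads(3 * GIB, 64 * KIB))
```

With the assumed defaults the ceiling is only a few hundred threads; shrinking the stack size helps, but you have to know your deepest call chain to do it safely.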

I don't trust those measurements. They never fully explored what is due to cache effects and what is due to context switching. You'd have heavy cache effects if you switched data between client contexts after select returns, too. So just because spinning on a futex shows at most 30µs of wasted CPU doesn't mean you won't waste that with select as well.

Also, an IO-bound thread is not just spinning on a futex aimlessly. A different mechanism is involved, so they should really benchmark with a more characteristic workload.

Speaking of characteristic workloads, they should probably also have measured on a tickless kernel, since I saw they complained about time quanta and HZ=100. Recent kernels are tickless, so they'll behave differently (possibly even worse).

> On 32-bit systems, you can also easily run out of address space for your thread stacks.

Well, don't run large servers with that many threads on 32-bit systems ;-) Many database vendors don't even package for or support 32-bit versions of Linux.

Sorry, I haven't bought into the whole "async is always better" trend. Some (ex?) Senior Google engineer (Paul Tyma) agrees with me:

http://www.mailinator.com/tymaPaulMultithreaded.pdf

The async/select pattern is usually good where there is very little business logic, like a router, a proxy or a simple web server. In a large application, having a giant dispatch call at the center with callbacks branching out of it is not a healthy pattern.

In those cases I like coroutines/user-space threading. They give you the reduced cost of having a single thread or a few threads without the heavy toll of callbacks.
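As a rough illustration of that style, here is a minimal sketch using Python's asyncio. This is my choice of coroutine library, not something named in the thread, and `serve_connection` and `demo` are hypothetical names. The per-connection logic reads top to bottom and keeps its state in local variables, yet everything runs on one thread with no explicit callbacks.

```python
import asyncio
import socket


async def serve_connection(sock):
    """Handle one connection as a coroutine; state is just local variables."""
    reader, writer = await asyncio.open_connection(sock=sock)
    lines_seen = 0
    while True:
        line = await reader.readline()  # suspends this coroutine, not the thread
        if not line:                    # empty read means the peer closed
            break
        lines_seen += 1
        writer.write(b"%d: %s" % (lines_seen, line.upper()))
        await writer.drain()
    writer.close()
    await writer.wait_closed()
    return lines_seen


async def demo():
    # A socketpair stands in for a real client/server connection.
    client_sock, server_sock = socket.socketpair()
    server_task = asyncio.create_task(serve_connection(server_sock))
    reader, writer = await asyncio.open_connection(sock=client_sock)
    replies = []
    for msg in (b"hello\n", b"world\n"):
        writer.write(msg)
        await writer.drain()
        replies.append(await reader.readline())
    writer.close()
    await writer.wait_closed()
    handled = await server_task
    return replies, handled


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The event loop still sits on a select-like call underneath; the coroutine layer just hides the dispatch so the connection logic is not scattered across handlers.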

Just to expand on the timing and cost argument:

When you have 10,000 tasks and about 8 cores (give or take a few), the number of context switches is very large. Switching in the kernel happens mostly at the system call boundary of blocking IO, and it requires the scheduler to decide which thread to wake up next and then change the running process.

This can be seen in the function context_switch in https://github.com/torvalds/linux/blob/master/kernel/sched/c... (not counting the arch-dependent components). It can hardly be compared in complexity and effort to switching between 4 and 8 registers in user space.

The above still doesn't include any changes to the TLB and memory protection tables, as I assume the OS optimizes those away when it switches between two threads of the same program. I'm not sure that optimization normally happens.

Probably nothing (leaving aside the performance issues you run into with thousands of parallel connections).

I personally prefer the old-school async approach, because there you are forced to explicitly manage your connections' state, and the application/process-wide data access is inherently race-condition free. I'd use this as far as possible.

If you let your OS schedule threads, you obviously have to be careful that shared data is correctly locked or only changed atomically, but you get parallelism (especially for CPU-heavy tasks) for free. If you are used to doing these chores (I'm not), perfect! And your connection's state (or the state of the required computations) can be arbitrarily complex (ugly?) and still be quite elegantly hidden in your thread's stack.

So, I don't see that one approach is better than the other. For me the extremes are probably clear in favor of one or the other, with a large grey area in between.

Using async to avoid race conditions due to multiple threads was great 15-20 years ago when single CPU machines ruled, but now you need multiple threads or processes for concurrency. Using multiple processes is usually not an option due to lack of any shared state (and if you're trying to share state across processes you should probably just use threads).

In addition to the problems discussed in the other replies, it's (nearly?) impossible to correctly cancel an operation while blocked on a socket.
Have you tried:

* closing the socket (it would be from another, monitoring thread in this case)

* setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO)

* Also, would that be any different from blocking on select with an infinite timeout? How do you cancel that? Or are you relying on the other sockets getting a constant stream of data to wake you up?

* do something with an ALARM signal

Closing the socket results in a race condition. You might close the socket, and then another thread opens a new file or socket and happens to get the same fd as your socket used to hold. Now your reader reads from some random unaffiliated fd it shouldn't be touching, causing all sorts of havoc.
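The fd reuse behind that race is easy to reproduce in a single thread. The sketch below uses pipes rather than sockets to stay self-contained (an assumption for illustration; the kernel hands out fd numbers the same way for both), and `demo_fd_reuse` is a made-up name. It relies on POSIX handing out the lowest free descriptor number.

```python
import os


def demo_fd_reuse():
    """Show that a closed fd number is handed straight to the next open."""
    sock_r, sock_w = os.pipe()    # stands in for "the socket"
    other_r, other_w = os.pipe()  # some unrelated resource
    stale_fd = sock_r             # the reader thread still remembers this number

    os.close(sock_r)              # the monitoring thread "cancels" the read

    # Another thread now opens something new. POSIX hands out the lowest
    # free descriptor, which is the number that was just closed.
    reused_fd = os.dup(other_r)

    os.write(other_w, b"unrelated")
    data = os.read(stale_fd, 9)   # the reader is now consuming someone else's data

    for fd in (reused_fd, sock_w, other_r, other_w):
        os.close(fd)
    return stale_fd == reused_fd, data


if __name__ == "__main__":
    print(demo_fd_reuse())
```

In a real multi-threaded program the window is narrow, but nothing stops the reuse from landing between the close and the reader's next system call.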

A timeout will work fine, but now you're polling, meaning you have an unpleasant tradeoff between efficiency and how long it takes for your thread to notice that it's dead.
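A minimal sketch of that tradeoff follows. It uses Python's `socket.settimeout` rather than `SO_RCVTIMEO` directly, and the names `blocking_reader`, `demo` and the `stats` dict are hypothetical. The reader wakes up every `poll_interval` seconds just to check a flag, so a shorter interval means faster cancellation but more idle wakeups.

```python
import socket
import threading
import time


def blocking_reader(sock, stop, poll_interval, stats):
    """Read from sock, waking up periodically to check the stop flag."""
    sock.settimeout(poll_interval)
    while not stop.is_set():
        try:
            data = sock.recv(4096)
        except socket.timeout:
            stats["wakeups"] += 1  # woke up only to poll the flag
            continue
        if not data:               # peer closed the connection
            break
    stats["exited_at"] = time.monotonic()


def demo(poll_interval=0.05):
    a, b = socket.socketpair()     # b stays silent, so the reader only times out
    stop = threading.Event()
    stats = {"wakeups": 0}
    t = threading.Thread(target=blocking_reader,
                         args=(a, stop, poll_interval, stats))
    t.start()
    time.sleep(poll_interval * 4)
    cancelled_at = time.monotonic()
    stop.set()                     # request cancellation
    t.join(timeout=5)
    a.close()
    b.close()
    # Idle wakeups so far, and how long the thread took to notice the flag.
    return stats["wakeups"], stats["exited_at"] - cancelled_at


if __name__ == "__main__":
    print(demo())
```

The cancellation latency is bounded by roughly one `poll_interval`, and the wakeup count grows the longer the connection sits idle.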

Canceling select or any other multi-fd call is really easy. Create a pipe and add it to your fd set. Any time you want the thread to wake up (e.g. because you need to tell it that you're canceling something) you just write to the pipe.
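For reference, a minimal sketch of the pipe trick, with hypothetical names (`wait_readable`, `demo`) and a socketpair standing in for a real connection. The thread blocks in select with no timeout and is woken by a single byte written to the pipe.

```python
import os
import select
import socket
import threading


def wait_readable(sock, wake_r):
    """Block until sock is readable or a byte arrives on the wake pipe."""
    readable, _, _ = select.select([sock, wake_r], [], [])  # no timeout
    if wake_r in readable:
        os.read(wake_r, 1)  # drain the wake-up byte
        return "cancelled"
    return "data"


def demo():
    a, b = socket.socketpair()   # b never sends, so only the pipe can wake us
    wake_r, wake_w = os.pipe()
    result = []
    t = threading.Thread(
        target=lambda: result.append(wait_readable(a, wake_r)))
    t.start()
    os.write(wake_w, b"x")       # cancel from the controlling thread
    t.join(timeout=5)
    for fd in (wake_r, wake_w):
        os.close(fd)
    a.close()
    b.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Because the wake-up byte stays in the pipe until it is read, it does not matter whether the write happens before or after the thread enters select, which is what makes this free of the race that signals have.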

Signals have a race condition similar to the one with closing the socket. If the signal is delivered after you check for cancellation but before you enter the system call, you'll hang.

> Closing the socket results in a race condition.

That is true. To go more in-depth, you'd do shutdown first. But I think you have to be connected for that.

> Canceling select or any other multi-fd call is really easy. Create a pipe and add it to your fd set. Any time you want the thread to wake up (e.g. because you need to tell it that you're canceling something) you just write to the pipe.

That's a good way, agreed. But I would still use a select with 2 file descriptors per thread: one fd for the pipe and one for the socket itself. Each thread handles its own request and processing as needed, without one global dispatch for the whole application. The pipe is exposed to the outside in case shutdown needs to be triggered (from another thread).

2 fds per thread works well. Unfortunately, this means you hit select's performance problems twice over, since the performance scales with the maximum fd you pass it, not the number of fds. But as long as that's OK for what you're doing, it's a nice way to arrange things.
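If the max-fd scaling does become a problem, one option is to keep the 2-fds-per-thread layout but swap select for poll, whose cost depends on the number of registered fds rather than on their numeric values. A minimal sketch, assuming a POSIX system where `select.poll` is available; `worker` and `demo` are made-up names, and the uppercase echo is only there to make the behaviour visible.

```python
import os
import select
import socket
import threading


def worker(sock, cancel_r):
    """Serve one socket until the peer closes or the cancel pipe fires."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    poller.register(cancel_r, select.POLLIN)
    while True:
        for fd, _events in poller.poll():  # blocks with no timeout
            if fd == cancel_r:
                return                     # shutdown requested from outside
            data = sock.recv(4096)
            if not data:
                return                     # peer closed the connection
            sock.sendall(data.upper())     # placeholder for real processing


def demo():
    conn, peer = socket.socketpair()
    cancel_r, cancel_w = os.pipe()
    t = threading.Thread(target=worker, args=(conn, cancel_r))
    t.start()
    peer.sendall(b"ping")
    reply = peer.recv(4096)      # the worker is alive and serving
    os.write(cancel_w, b"x")     # now ask it to stop
    t.join(timeout=5)
    stopped = not t.is_alive()
    for fd in (cancel_r, cancel_w):
        os.close(fd)
    conn.close()
    peer.close()
    return reply, stopped


if __name__ == "__main__":
    print(demo())
```

Each thread still owns exactly one connection and one cancel pipe, so the structure is the same as with select; only the waiting primitive changes.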