poll is fairly portable, but still has scaling problems. To get beyond them, you usually need something OS-specific, such as epoll (Linux), kqueue (BSDs of some kind), etc. I'd expect a cross-platform networking library to use the best available abstraction, especially on major platforms such as Linux, and perhaps fall back to select on odd platforms. Of course, your docs need to point out what the behavior is.
The problem with poll/select is that you need to pass (transfer) to the kernel the entire list of events you're interested in, just to wait for a single event. Each time you re-enter your event loop you need to do this, and if you have thousands of connections, it will break down, because poll is O(number of connections).
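To make the cost concrete, here is a minimal sketch using Python's select.select as a stand-in for the C call. The helper name wait_readable and the socketpair setup are made up for illustration; the point is only that the whole list goes into every call, even though one socket is ready.

```python
import select
import socket

def wait_readable(conns, timeout=1.0):
    """Return the sockets in `conns` that are readable, using select()."""
    # The full list of interesting sockets is handed to the kernel on
    # every call, even if only one of them ends up ready.
    readable, _, _ = select.select(conns, [], [], timeout)
    return readable

# Ten connections, only one of which has data pending.
pairs = [socket.socketpair() for _ in range(10)]
conns = [a for a, _ in pairs]
pairs[3][1].send(b"ping")

ready = wait_readable(conns)
print(len(conns), "sockets passed in,", len(ready), "ready")

for a, b in pairs:
    a.close()
    b.close()
```

With thousands of mostly idle connections, that per-call list is the part that stops scaling.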
epoll, for example, works around this by having an "epoll FD", which on the kernel side holds all the events you're interested in. You wait on that epoll FD with epoll_wait, but you don't specify the events you're interested in when you wait: you do that ahead of time, with other system calls (epoll_ctl). This lets you change the event list only when you need to, which happens far less often than data arriving on some connection. The wait is supposed to be O(1) in the number of registered connections, instead of O(number of connections) per wait call.
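A rough sketch of that register-once, wait-many-times shape, using Python's select.epoll wrapper (Linux only). The function name epoll_demo and the socketpair standing in for a real connection are assumptions for the example.

```python
import select
import socket

def epoll_demo():
    """Register interest once, then wait repeatedly. Linux only."""
    a, b = socket.socketpair()
    fd = a.fileno()
    ep = select.epoll()
    try:
        # Interest is declared ahead of time, with its own call.
        ep.register(fd, select.EPOLLIN)

        results = []
        for payload in (b"one", b"two"):
            b.send(payload)
            # The wait call itself carries no interest list.
            results.append(ep.poll(1.0))
            a.recv(16)  # drain so the next wait starts clean

        # Changing the interest set is a separate, explicit step.
        ep.unregister(fd)
        return fd, results
    finally:
        ep.close()
        a.close()
        b.close()

if hasattr(select, "epoll"):
    fd, results = epoll_demo()
    print("events per wait:", [len(ev) for ev in results])
```

Each ep.poll call returns (fd, event mask) pairs for the descriptors that are ready, without the caller resubmitting anything.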
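My understanding is that kqueue works similarly, but I'm a Linux guy, so I can't really tell you.
select also has other problems with high-numbered FDs: an fd_set is a fixed-size bitmap, so descriptors numbered FD_SETSIZE (typically 1024) or above can't be used with it at all.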
yeah...epoll and kqueue in my experience are easily interchangeable. I built a server on FreeBSD and the port to Linux was straightforward. My first event-based socket usage was on Windows NT around 1999. When we ported the server parts to Linux, replacing with epoll was also straightforward.
Plus kqueue gives you ways to wait for other events (timeouts, signals, etc.), and epoll combined with other Linux-specific calls such as timerfd and signalfd does too. This simplifies your code a lot.
libev - http://software.schmorp.de/pkg/libev.html - is an existing library that provides async network I/O over a number of backends, including (I think) kqueue, epoll and, in the worst case, select.
This competes with libevent, libev, and libuv, all of which use the best method for the platform where they're installed: kqueue on BSD, epoll on Linux, etc.
That's one of the big reasons to use a lib for this: you get the best performance without having to change your code to get it (or bother detecting which is best, etc.).
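Python's standard library has the same idea in miniature, which makes for a short illustration: selectors.DefaultSelector is documented as an alias for the most efficient implementation available on the current platform. This is only an analogy for what libevent, libev and libuv do in C, not a description of their APIs.

```python
import selectors
import socket

# The application code below never names epoll, kqueue, poll or select.
sel = selectors.DefaultSelector()
backend = type(sel).__name__  # e.g. EpollSelector on Linux, KqueueSelector on BSD/macOS

a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)
b.send(b"hello")

events = sel.select(timeout=1.0)
received = [key.fileobj.recv(16) for key, _mask in events]
print("backend:", backend, "received:", received)

sel.unregister(a)
sel.close()
a.close()
b.close()
```

The same script runs unchanged on each platform and picks up whichever backend is available there.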
I think the only viable (in terms of portability) alternative is poll. There's a pretty good comparison of the two written by the author of cURL [0]. Essentially, poll is the same speed and doesn't have a hard-coded FD limit, but it runs on fewer platforms.
I sent patches in for curl about 10 years ago to switch from select to poll to get around the 1024 FD limit (which was a problem for multi-threaded servers that handled many sockets).
I like epoll over poll because you don't need a central point in your application that knows about all FDs; each component can manage its own FD registration with the OS.
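Something like the following is what I take that to mean. It is a sketch with a hypothetical Component class, using Python's selectors module rather than raw epoll, so the selector object still keeps a small userspace map; the point is that the event loop itself holds no FD list.

```python
import selectors
import socket

class Component:
    """Hypothetical component that owns one socket and registers it itself."""

    def __init__(self, sel, name):
        self.name = name
        self.sock, self.peer = socket.socketpair()
        self.seen = []
        # The component talks to the selector directly; nobody else
        # needs to know this socket exists.
        sel.register(self.sock, selectors.EVENT_READ, self.on_readable)

    def on_readable(self):
        self.seen.append(self.sock.recv(64))

    def close(self, sel):
        sel.unregister(self.sock)
        self.sock.close()
        self.peer.close()

def run_once(sel, timeout=1.0):
    # The loop only dispatches whatever the OS reports as ready.
    for key, _mask in sel.select(timeout):
        key.data()

sel = selectors.DefaultSelector()
logger = Component(sel, "logger")
cache = Component(sel, "cache")

cache.peer.send(b"invalidate")
run_once(sel)
print(cache.name, cache.seen, logger.name, logger.seen)
```

With poll or select, run_once would instead need to be handed every component's FD on each iteration.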
"Context switching is expensive. My rule of thumb is that it'll cost you about 30µs of CPU overhead. This seems to be a good worst-case approximation. Applications that create too many threads that are constantly fighting for CPU time (such as Apache's HTTPd or many Java applications) can waste considerable amounts of CPU cycles just to switch back and forth between different threads."
I don't trust those measurements. They never fully explored what is due to cache effects and what is due to context switching. You'd have heavy cache effects if you switched data between client contexts after select returns, too. So just because spinning on a futex shows a max of 30us of wasted CPU doesn't mean you won't waste that with select as well.
Also, in the case of an IO-bound thread, the threads are not just spinning on a futex aimlessly. There is a different mechanism, so they should really benchmark with a more characteristic workload.
Speaking of characteristic workloads, they should probably also have measured on a tickless kernel, since I saw they complained about time quanta and HZ=100. Well, recent kernels are tickless, so they'll behave differently. (Might even be worse.)
> On 32-bit systems, you can also easily run out of address space for your thread stacks.
Well, don't run large servers with so many threads on 32-bit systems ;-) Many database vendors don't even package for or support 32-bit versions of Linux.
Sorry, I haven't bought into the whole "async is always better" trend. Some (ex?) Senior Google engineer (Paul Tyma) agrees with me:
The async/select pattern is usually good where there is very little business logic, as in a router, a proxy, a simple web server and so on. In a large application, having a giant dispatch call at the center, with callbacks branching out from it, is not a healthy pattern.
Probably nothing (leaving aside performance issues when you run into the thousands of parallel connections).
I personally prefer the old-school async approach, because there you are forced to explicitly manage your connections' state, and the application/process-wide data access is inherently race-condition free. I'd use this as far as possible.
If you let your OS schedule threads, obviously you have to be careful that shared data is correctly locked or only changed atomically, but you get parallelism (especially for CPU-heavy tasks) for free. If you are used to doing these chores (I'm not), perfect! And your connection's state (or the state of required computations) can be arbitrarily complex (ugly?) and still quite elegantly hidden in your thread's stack.
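For comparison, a thread-per-connection sketch of the trade-off described above. The names handle, total_bytes and total_lock are made up, and socketpairs stand in for real clients. Note that in standard CPython the GIL means this shows the structure (blocking calls, state on the stack, a lock around shared data) rather than real CPU parallelism.

```python
import socket
import threading

total_bytes = 0                 # shared across all threads
total_lock = threading.Lock()   # the "chore": guard every shared update

def handle(conn):
    """One thread per connection; per-connection state is just locals."""
    global total_bytes
    received = 0
    while True:
        chunk = conn.recv(4096)  # blocks; the OS runs other threads meanwhile
        if not chunk:            # peer closed the connection
            break
        received += len(chunk)
    conn.close()
    with total_lock:
        total_bytes += received

pairs = [socket.socketpair() for _ in range(4)]
threads = [threading.Thread(target=handle, args=(a,)) for a, _ in pairs]
for t in threads:
    t.start()

# Each "client" sends a different amount of data, then hangs up.
for i, (_, b) in enumerate(pairs):
    b.sendall(b"x" * ((i + 1) * 100))
    b.close()

for t in threads:
    t.join()
print("total bytes:", total_bytes)
```

There is no dispatch loop and no explicit state machine here, at the price of the lock around total_bytes.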
So, I don't see that one approach is better than the other. For me the extremes are probably clear in favor of one or the other, with a large grey area in between.
Using async to avoid race conditions due to multiple threads was great 15-20 years ago when single CPU machines ruled, but now you need multiple threads or processes for concurrency. Using multiple processes is usually not an option due to lack of any shared state (and if you're trying to share state across processes you should probably just use threads).
* closing the socket (it would be from another, monitoring thread in this case)
* setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO)
* Would that be different from blocking on select with an infinite timeout? How do you cancel that? Or are you relying on other sockets getting a constant stream of data to wake you up?
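On the select side, one common answer to the cancel question is to add an extra wakeup descriptor to the set, so another thread can interrupt the wait by writing a byte to it. A minimal sketch, assuming a socketpair as the wakeup channel; wait_or_cancel and the variable names are hypothetical.

```python
import select
import socket
import threading

def wait_or_cancel(sock, cancel_r):
    """Block in select() with no timeout until data arrives or cancel fires."""
    readable, _, _ = select.select([sock, cancel_r], [], [])
    if cancel_r in readable:
        cancel_r.recv(1)  # consume the wakeup byte
        return "cancelled"
    return "data"

data_a, data_b = socket.socketpair()      # stands in for a quiet connection
cancel_r, cancel_w = socket.socketpair()  # wakeup channel

# A monitoring thread cancels the wait after a short delay.
timer = threading.Timer(0.1, cancel_w.send, args=(b"x",))
timer.start()
outcome = wait_or_cancel(data_a, cancel_r)
timer.join()
print(outcome)

for s in (data_a, data_b, cancel_r, cancel_w):
    s.close()
```

This avoids depending on traffic from other sockets to get out of the wait.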