

by terra_t 5797 days ago

Yeah, but there's a fetishization of "high concurrency" (being able to support a huge number of connections) rather than absolute performance.

For instance, you might have a system which has a latency of 1 second, and at a given workload, you have 10,000 connections. In the Java culture, people think you're a genius if you can increase those connections to 100,000 and increase the latency to 10 seconds.

End users, on the other hand, would be happier if you cut the latency to 0.1 seconds, but a lot of people would then think you're a loser who can only manage to handle 1,000 concurrent connections.
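The three scenarios above line up with Little's law (connections in flight = throughput × latency), which the comment doesn't name but which fits its numbers. A rough sketch, assuming each connection carries exactly one request at a time; `throughput` is a hypothetical helper, not anything from a real library:

```python
def throughput(concurrent_connections: int, latency_seconds: float) -> float:
    """Requests completed per second, by Little's law (L = lambda * W)."""
    return concurrent_connections / latency_seconds


# (concurrent connections, latency in seconds) taken from the comment above
scenarios = {
    "baseline": (10_000, 1.0),
    "high concurrency": (100_000, 10.0),
    "low latency": (1_000, 0.1),
}

for name, (connections, latency) in scenarios.items():
    print(f"{name}: {throughput(connections, latency):,.0f} requests/s")
```

Under that assumption all three setups serve the same 10,000 requests per second; the only thing that changes is how long each user waits.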

Of course, getting that latency down is a holistic process that requires you to think about the client, the server, and what exactly goes over the wire.


jacquesm 5797 days ago

If you could increase the number of connections to 100,000 you would indeed be a genius, because when you bind to a network interface using IPv4 there is a hard limit: the port number is a short integer, which automatically limits you to 65536 connections. (Actually a few fewer: you'll usually lose 3 for stdin, stdout and stderr, which you can close to reuse, and one for the listen socket.)

As far as I know the only way around this is to use multiple IPs (possibly aliases on the same interface), but that would still require a new process.

So even if your per-process limit for fds can be larger than 64K, the network layer, or the mapper that turns fds into socket ids for the network stack to work with, may impose a restriction. I don't know enough about the Linux kernel to figure out what exactly causes this.

I use the 64K limit on some high-throughput machines (mostly video and image servers), but when I go over that I need to start another process. Possibly there's a way around that, but the expense of another process is fairly small, so I haven't put in much time to see if I can work around it. Socket to fd mapping presumably takes into account the address as well as the port, so it shouldn't be a problem, but on the kernel of the machines where I have to resort to these tricks it appears to be a limit.

Maybe someone with more knowledge of the guts of the Linux kernel can point out why this happens.


jbeda 5797 days ago

TCP connections are identified by the (src ip, src port, dest ip, dest port) tuple. The server only needs one port. So theoretically a server can handle 64k connections per client IP on a single listening port.
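This is easy to see on loopback. A minimal sketch using only Python's standard `socket` module; the count of 5 connections is arbitrary, and the port is whatever the OS hands out:

```python
import socket

# Listen on one loopback port chosen by the OS (port 0 means "any free port").
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(16)
server_addr = server.getsockname()

clients, accepted = [], []
for _ in range(5):
    clients.append(socket.create_connection(server_addr))
    conn, _peer = server.accept()
    accepted.append(conn)

# Every accepted socket shares the server's single local port...
local_ports = {conn.getsockname()[1] for conn in accepted}
# ...and the connections differ only in the client half of the tuple.
tuples = {conn.getpeername() + conn.getsockname() for conn in accepted}

print("server-side local ports in use:", local_ports)
print("distinct (src ip, src port, dest ip, dest port) tuples:", len(tuples))

for s in clients + accepted + [server]:
    s.close()
```

The server side never consumes a new port per connection; only the client's source port changes.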


carson 5797 days ago

You can see this in the 1M connection test done here: http://www.metabrew.com/article/a-million-user-comet-applica... Look at the "Turning it up to 1 Million" section where he details the need to use 17 IPs for the client side.


thwarted 5797 days ago

Yeah, and that's on the client side, as is indicated by the first sentence of that section:

"Creating a million tcp connections from one host is non-trivial."

The key words being "from one host". With a single client machine connecting to a single server endpoint, the (src ip, src port, dest ip, dest port) tuple can only vary in src port (from the client's perspective), so that's where the 65k limit, and the need for more IPs to get past it, comes from. Using multiple source IPs on the same machine is like using multiple client hosts.
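As back-of-the-envelope arithmetic, the number of source IPs a single load-generating host needs is just the target connection count divided by the usable source ports per IP, rounded up. A sketch with a hypothetical helper `source_ips_needed`; the 60,000 figure is an illustrative assumption about the usable ephemeral range, not a number taken from the linked article:

```python
import math


def source_ips_needed(target_connections: int, usable_ports_per_ip: int) -> int:
    """Source IPs required when every connection goes to one server ip:port."""
    return math.ceil(target_connections / usable_ports_per_ip)


# Theoretical ceiling: every nonzero 16-bit port is usable as a source port.
print(source_ips_needed(1_000_000, 65_535))  # 16

# Assumed, more conservative ephemeral range of 60,000 ports per IP.
print(source_ips_needed(1_000_000, 60_000))  # 17
```

Any realistic ephemeral port range is narrower than the full 16 bits, which is why a real test ends up needing more than the theoretical 16 addresses.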

"...using IPv4 there is a hard limit of the short integer used to indicate the port number which automatically limits you to 65536 connections (actually a few less, usually you'll lose 3 for stdin, stdout and stderr (which you can close to reuse them) and one for the listen socket)."

The file descriptor limit is independent of the 65k total possible source ports. The source port limit is part of TCP/UDP. The file descriptor limit is set per process by ulimit (nofile in limits.conf) and system-wide in /proc. If you need more file descriptors, you can reuse 0, 1 and 2, but that's not going to free up any ports so that a single process can make more connections to the same server endpoint.
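Both limits can be inspected from a script. A small sketch using Python's standard `resource` module (Unix only); `/proc/sys/fs/file-max` is my assumption about which system-wide setting is meant here, and the helper name `fd_limits` is made up for illustration:

```python
import resource


def fd_limits() -> dict:
    """Return the per-process and (on Linux) system-wide fd limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    limits = {"soft": soft, "hard": hard, "system": None}
    try:
        # Linux-specific; absent on other Unix systems.
        with open("/proc/sys/fs/file-max") as f:
            limits["system"] = f.read().strip()
    except OSError:
        pass
    return limits


limits = fd_limits()
print(f"per-process nofile: soft={limits['soft']}, hard={limits['hard']}")
print(f"system-wide file-max: {limits['system']}")
```

None of these values has anything to do with port numbers, which is the point: raising them lets one process hold more sockets, but doesn't create more source ports.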


jacquesm 5797 days ago

Now that is a test. Thanks for posting that, it is the most interesting thing I've seen all day.


jacquesm 5797 days ago

But a server can have multiple IPs, so a server should be able to handle more than 64K connections from multiple clients without a problem. In practice there appears to be some kind of limit.


adamtj 5797 days ago

The server doesn't need multiple IPs to handle > 65535 connections. All the server connections to a given IP are to the same port. For a given client, the unique key for an HTTP connection is (client-ip, PORT, server-ip, 80). The only number that can vary is PORT, and that's a value on the client. So each client is limited to 65535 connections to the server, but a second client could have another 65K connections to the same server-ip:port.

edit: You may be limited by the number of open sockets or file handles. It's likely a per-process limit. Google or some Linux guru could help you track down what limit it actually is, but it's not the number of server ports available. It might be a number you could raise.


jacquesm 5797 days ago

Right, that makes perfect sense. But it really makes me wonder why I run into that hard limit. I've tried just about everything to get around it, and no matter what I do that seems to be the magic number.

I should go and do some testing to see what's causing this; you make me feel like the solution is right around the corner.

Re your edit: ulimit will happily raise the number above 64K and all the /proc/* settings seem to be OK, so that's not it. It has to be some other layer in the stack that causes this. I'll definitely spend some time on this; it's been bugging me for a long time.

edit2: there seems to be an upper limit, max_user_watches, on what epoll will handle.
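For anyone wanting to check that value: on Linux kernels that expose it, the setting lives at `/proc/sys/fs/epoll/max_user_watches` and caps the total number of epoll registrations per user. A sketch that reads it, assuming that path; the helper name `epoll_max_user_watches` is hypothetical, and it returns `None` where the file doesn't exist:

```python
from pathlib import Path
from typing import Optional

EPOLL_WATCH_LIMIT = Path("/proc/sys/fs/epoll/max_user_watches")


def epoll_max_user_watches(path: Path = EPOLL_WATCH_LIMIT) -> Optional[int]:
    """Read the per-user epoll registration limit, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not Linux, an older kernel, or an unreadable/unexpected value.
        return None


limit = epoll_max_user_watches()
if limit is None:
    print("max_user_watches is not exposed on this system")
else:
    print(f"epoll max_user_watches: {limit}")
```

Whether this is actually the limit being hit at 64K is not something the sketch can tell you; it only shows where to look.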
