by rorrr
4893 days ago
> The fact they're from just 31 different IP addresses isn't relevant. They're still individual connections from clients to the end server.
That's where you are wrong. Not only are they keepalive connections, they are completely local. Do it over an actual network from 50K different IPs and see how that performs.
I agree that a connection from a local IP is not as much work for the kernel as from a remote IP, but it's the same amount of work for the server portion of the software to service each of the connections whether they are local or remote. Remember too that the host machine is running both the server and the process generating the client load. Generating the client traffic will be costlier than what is saved by the local traffic not traversing the full stack.
Yes, ideally two machines (one with a whole bunch of virtual IPs to fake the clients, and the other hosting the server) would be a better test; that way the traffic reaching the machine hosting the server goes through the full network stack.
> Do it over an actual network from 50K different IPs and see how that performs.
And I don't see what difference having unique IPs or not will make to the networking performance of the server. Incoming connections (from a real network) are going to cause the same amount of work regardless of the remote IP (assuming there are no DNS lookups), and iptables or firewall stuff should have minimal impact even if you spray a huge number of unique IPs at it.
For my testing of a similar scenario I use a couple of old blade servers (2 chassis of 24 PIII 700MHz blades each) to generate the load. Each blade has a unique IP, and for 500,000 connections in total I need 1M sockets (each connection can have two open concurrently as they overlap). Divided across 24 blades that is 41,666 sockets per blade, which fits with a tweak to the ephemeral port range.
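A rough sanity check of that socket budget, as a Python sketch. Two things here are assumptions rather than anything stated above: that the 41,666 figure comes from dividing 1M sockets across 24 blades, and that the blades start from the commonly cited Linux default ephemeral range of 32768-61000 (check `net.ipv4.ip_local_port_range` on your own machines).

```python
# Socket budget per load-generating blade (illustrative sketch only).
TOTAL_CONNECTIONS = 500_000     # long-polling clients to simulate (from the post)
SOCKETS_PER_CONNECTION = 2      # consecutive polls overlap briefly (from the post)
BLADES = 24                     # ASSUMPTION: the divisor that yields ~41,666

# ASSUMPTION: typical default Linux ephemeral port range; yours may differ.
ASSUMED_DEFAULT_PORT_RANGE = (32768, 61000)


def sockets_per_blade(connections: int, per_connection: int, blades: int) -> int:
    """Sockets each blade must hold open, rounded down."""
    return connections * per_connection // blades


def ports_available(port_range: tuple[int, int]) -> int:
    """Number of ports in an inclusive (low, high) range."""
    low, high = port_range
    return high - low + 1


needed = sockets_per_blade(TOTAL_CONNECTIONS, SOCKETS_PER_CONNECTION, BLADES)
available = ports_available(ASSUMED_DEFAULT_PORT_RANGE)

print(f"sockets needed per blade:       {needed}")
print(f"ports in assumed default range: {available}")
print(f"range needs widening:           {needed > available}")
```

Under those assumptions the default range is too small, which is why the ephemeral port range has to be widened; the absolute ceiling of roughly 64K ports per source IP towards a single server address and port still leaves room.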
My server keeps long polling connections for ~25 seconds. The total network cost of each poll is ~800 bytes[1] (TCP connection initiation, HTTP request, HTTP response, TCP teardown). 500,000 polls every 25 seconds = 20,000/sec.
20,000 conns/sec * 800 bytes = 16,000,000 bytes/sec = 128,000,000 bits/sec.
Luckily each blade chassis has 3 x 100Mbps ethernet ports (Gigabit would have been nice but these are old blade servers) on separate backplanes (public, private, mgmt) so I split the 24 blades up with 8 on each interface to keep well below the 100Mbps limit of each port.
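The bandwidth arithmetic can be reproduced with a few lines of Python. This is only a sketch: the per-poll size and poll interval are the approximate figures from the post, and spreading the load evenly across all six ports (2 chassis x 3 ports) is an assumption made for illustration.

```python
# Aggregate and per-port bandwidth for the long-polling load (illustrative sketch).
POLLS = 500_000                  # concurrent long-polling clients (from the post)
POLL_INTERVAL_S = 25             # each poll is held ~25 seconds (from the post)
BYTES_PER_POLL = 800             # ~800 bytes per poll on the wire (from the post)
PORT_CAPACITY_BPS = 100_000_000  # one 100Mbps ethernet port

CHASSIS = 2                      # from the post
PORTS_PER_CHASSIS = 3            # public, private, mgmt backplanes (from the post)

polls_per_sec = POLLS // POLL_INTERVAL_S
total_bytes_per_sec = polls_per_sec * BYTES_PER_POLL
total_bits_per_sec = total_bytes_per_sec * 8

# ASSUMPTION: load is spread evenly over every port on both chassis.
ports = CHASSIS * PORTS_PER_CHASSIS
per_port_bps = total_bits_per_sec / ports
utilisation = per_port_bps / PORT_CAPACITY_BPS

print(f"polls/sec:       {polls_per_sec}")
print(f"total bits/sec:  {total_bits_per_sec}")
print(f"per-port Mbps:   {per_port_bps / 1e6:.1f}")
print(f"port utilisation: {utilisation:.0%}")
```

Even if only one chassis (three ports) carried the whole load, each port would sit at roughly 43Mbps, still under the 100Mbps limit.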
[1] Which is why WebSockets is much more efficient; roll on adoption in the popular browsers (not just among the few who run relatively recent installs of Chrome/Firefox).