by shenedu 4894 days ago
author here.

> Notice that in his tests 97% of these connections don't do anything, just idle. He maxes out at 18764 req/sec.

Yes, the test only measures how many concurrent connections can be held. With the 600k connections held, ab confirms that it can do about 31405.53 requests per second; the HTTP body is 1024 bytes.

> Notice that they are "keep-alived", coming from the same IP, so not truly separate connections

Not from the same ip, from many ips: 192.168.1.200~230

> Keep in mind that 600K concurrent connections cannot possibly do anything useful at the same time for many reasons (CPU, bandwidth, server I/O), so they are not truly concurrent

They each send a request to the server every 5s~30s and wait for the response.


> Not from the same ip, from many ips: 192.168.1.200~230

So from 31 IPs, which can be done with 31 keepalive connections.

Try hitting your server with even 50K real connections and see how long it lasts (if it lasts at all).

> They send a request every 5s~30s to server, and wait for response

Exactly. ALL of them don't do anything concurrently, they just sit idly.

You seem to be missing the point of the scenario they're testing.

Lots of idle connections (doing overlapping long polling) is exactly how many COMET servers work.

We send ~60 "events" via our COMET server (APE from www.ape-project.org) in a typical 2 hour period.

The server side work to decide when/what to send the clients is easy because it's the same information that gets sent whether there is 1 connection or 1,000,000.

The fact they're from just 31 different IP addresses isn't relevant. They're still individual connections from clients to the end server.

> The fact they're from just 31 different IP addresses isn't relevant. They're still individual connections from clients to the end server.

That's where you are wrong. Not only are they keepalive connections, they are completely local. Do it over an actual network from 50K different IPs and see how that performs.

Again you're missing the point. Just checking 31 sockets for data is much much much less work than checking 600k sockets, even if they are all via local IPs.

I agree that a connection from a local IP is not as much work for the kernel as from a remote IP, but it's the same amount of work for the server portion of the software to service each of the connections whether they are local or remote. Remember too that the host machine is running both the server and the process generating the client load. Generating the client traffic will be costlier than what is saved by the local traffic not traversing the full stack.

Yes, ideally the test would use two machines (one with a whole bunch of virtual IPs to fake the clients, the other hosting the server), so that traffic to the machine hosting the server goes via the full network stack.

> Do it over an actual network from 50K different IPs and see how that performs.

And I don't see what difference having unique IPs or not will make to the networking performance of the server. Incoming connections (from a real network) are going to cause the same amount of work regardless of the remote IP (assuming there are no DNS lookups), and iptables or firewall stuff should have minimal impact even if you spray a huge number of unique IPs at it.

For my testing of a similar scenario I use a couple of old blade servers (2 chassis of 24 PIII 700MHz blades each) to generate the load. Each blade has a unique IP, and for 500,000 connections in total I need 1M sockets (each connection can have two open concurrently as they overlap) = 41666 sockets per blade; that fits with a tweak to the ephemeral port range.
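A rough sketch of that socket arithmetic, in Python. Dividing the 1M sockets across 24 blades is what reproduces the 41666 figure. The default ephemeral range used here (32768-60999, a common Linux default) and the widened range (1024-65535) are assumptions for illustration, not values taken from the setup described above, and the function names are made up.

```python
# Sketch: do the sockets needed per blade fit in the ephemeral port range?
# Assumption: each blade has one IP and talks to one server address/port,
# so its outgoing connections are limited by the size of the port range.

def sockets_per_blade(connections: int, sockets_per_connection: int, blades: int) -> int:
    """Sockets each load-generating blade has to hold open (rounded down)."""
    return connections * sockets_per_connection // blades

def ports_in_range(low: int, high: int) -> int:
    """Number of ports in an inclusive ephemeral port range."""
    return high - low + 1

needed = sockets_per_blade(500_000, 2, 24)     # two overlapping sockets per connection
default_ports = ports_in_range(32768, 60999)   # assumed default range
widened_ports = ports_in_range(1024, 65535)    # assumed "tweaked" range

print(f"needed per blade:   {needed}")
print(f"fits default range: {needed <= default_ports} ({default_ports} ports)")
print(f"fits widened range: {needed <= widened_ports} ({widened_ports} ports)")
```

Under those assumptions the default range is too small for one blade's share and the widened range is large enough, which is consistent with needing the tweak.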

My server keeps long polling connections for ~25 seconds. The total network cost of each poll is ~800 bytes[1] (TCP connection initiation, HTTP request, HTTP response, TCP teardown). 500,000 polls every 25 seconds = 20,000/sec.

20,000 conns/sec * 800 bytes = 16,000,000 bytes/sec = 128,000,000 bits/sec.
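The same back-of-the-envelope estimate as a small Python sketch, so the inputs can be varied. The numbers are the ones quoted above; the function name is only illustrative.

```python
# Sketch: network load generated by long-polling clients.
# Inputs: number of clients, how long each poll is held, bytes on the wire per poll.

def poll_bandwidth(clients: int, poll_seconds: float, bytes_per_poll: int):
    """Return (polls/sec, bytes/sec, bits/sec) for evenly spread polls."""
    polls_per_sec = clients / poll_seconds
    bytes_per_sec = polls_per_sec * bytes_per_poll
    bits_per_sec = bytes_per_sec * 8
    return polls_per_sec, bytes_per_sec, bits_per_sec

polls, byte_rate, bit_rate = poll_bandwidth(500_000, 25, 800)
print(f"{polls:,.0f} polls/sec, {byte_rate:,.0f} bytes/sec, {bit_rate / 1e6:.0f} Mbit/sec")
```

This assumes the polls are spread evenly over the 25 second window; bursts would push the peak rate higher than the average shown here.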

Luckily each blade chassis has 3 x 100Mbps ethernet ports (Gigabit would have been nice but these are old blade servers) on separate backplanes (public, private, mgmt) so I split the 24 blades up with 8 on each interface to keep well below the 100Mbps limit of each port.

1. Which is why WebSockets is much more efficient; roll on adoption in the popular browsers (not just among the few who run relatively recent installs of Chrome/Firefox).