

by littlecranky67 2052 days ago

I'm sorry to disappoint you, but your benchmark methodology is flawed. You did not consider TCP congestion control/window scaling. TCP connections between two peers are "cold" (= slow) after the 3-way handshake, and it takes several round trips to "warm" them up (that is, to allow data to be sent at a rate that saturates your bandwidth). What you (and most other people performing HTTP load benchmarks) overlooked is that the kernel (Linux, but also all other major OS kernels) caches the state of the "warm" connection based on the IP address. So basically, when you run this kind of benchmark with 1000 consecutive runs, only your first run uses a "cold" TCP connection. All other 999 runs will reuse the cached TCP congestion control send window and start with a "hot" connection.

The bad news: for website requests <2MB, you spend most of your time waiting for round trips to complete; that is, you spend most of the time warming up the TCP connection. So it's very likely that if you redo your benchmarks and clear the window cache between runs (google tcp_no_metrics_save), you will get completely different results.
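
A minimal sketch of what clearing the cache between runs could look like on a Linux host, assuming root privileges and the iproute2 `ip` tool. `net.ipv4.tcp_no_metrics_save` is a real sysctl and `ip tcp_metrics flush all` is a real iproute2 subcommand; the helper names and the dry-run default are hypothetical, and both client and server would need to do this.

```python
import subprocess

# Stop the kernel from saving per-destination TCP metrics when connections close.
DISABLE_METRICS_CACHE = ["sysctl", "-w", "net.ipv4.tcp_no_metrics_save=1"]
# Drop whatever is already cached from earlier runs.
FLUSH_METRICS_CACHE = ["ip", "tcp_metrics", "flush", "all"]


def reset_commands():
    """Return the commands to run before each benchmark iteration."""
    return [DISABLE_METRICS_CACHE, FLUSH_METRICS_CACHE]


def reset_tcp_metrics(dry_run=True):
    """Print the commands (dry run) or execute them (needs root on Linux)."""
    for cmd in reset_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    reset_tcp_metrics(dry_run=True)
```

With `dry_run=True` nothing is changed on the machine; the two commands are only printed so they can be reviewed first.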

Here is an analogy: if you want to compare the acceleration of two cars, you would have to race them from point A to point B, starting at a velocity of 0 mph at point A, and measure the time it takes to reach point B. In your benchmark, you basically allowed the cars to start 100 meters before point A and measured the time between passing point A and point B. Admittedly, for cars, acceleration decreases with increasing velocity; for TCP it's the other way around: the amount of data allowed to be sent per round trip gets larger with every round trip (usually somewhat exponentially).
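
To put rough numbers on the "cold" vs. "warm" difference, here is an idealized slow-start model. It is a sketch under simplifying assumptions, not how any real stack behaves in detail: the window doubles every round trip, there is no packet loss, no ssthresh or receiver-window cap, a segment carries 1460 bytes, and a cold connection starts at 10 segments (the RFC 6928 initial window). The "warm" starting window of 640 segments is a made-up value for illustration.

```python
MSS = 1460  # payload bytes per segment (assumed, typical for Ethernet)


def round_trips(payload_bytes, initial_cwnd_segments, mss=MSS):
    """Round trips needed to deliver the payload when the congestion
    window starts at initial_cwnd_segments and doubles each round trip."""
    segments_left = -(-payload_bytes // mss)  # ceiling division
    cwnd = initial_cwnd_segments
    trips = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window this round trip
        cwnd *= 2              # idealized slow start: window doubles
        trips += 1
    return trips


if __name__ == "__main__":
    print(round_trips(100_000, 10))     # cold, 100 kB response: 3
    print(round_trips(100_000, 640))    # warm, 100 kB response: 1
    print(round_trips(2_000_000, 10))   # cold, 2 MB response: 8
    print(round_trips(2_000_000, 640))  # warm, 2 MB response: 2
```

Multiply the round-trip counts by the link RTT to see how much of the measured time is warm-up rather than transfer.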

1 comments

kdunglas 2052 days ago

Hi, and thanks for the feedback.

I'm aware of this "issue" (I must mention it in the repo, and I will). However, I don't think that it matters much for a web API: in most cases, inside web browsers, the TCP connection will already be "warmed" when the browser sends the first (and subsequent) requests to the API, because the browser will have loaded the HTML page, the JS code, etc., usually from the same origin. And even if that isn't the case (mobile apps, API served from a third-party origin...), only the first requests will have to "warm" the connection (it doesn't matter whether you use compound or atomic documents then); all subsequent requests during the lifetime of the TCP connection will use a "warmed" connection.

Or am I missing something?

Anyway, a PR to improve this benchmark (which aims at measuring the difference - if any - between serving atomic documents vs serving compound documents in real-life use cases) and show all cases will be very welcome!

littlecranky67 2052 days ago

What you say would be true if images/JS/CSS were truly served from the same IP address (not hostname!). In reality, people use CDNs to deliver static assets like images/JS/CSS, so only the API calls themselves warm up the TCP connection to the actual data backend. Also, things like DNS load balancing would break the warm-up, because the congestion control caches operate on IPs, not hostnames.

Additionally, it's really hard to benchmark and claim something is "faster". You will always measure under the same networking conditions (latency, packet loss rate, bandwidth). So if a benchmark between two machines yields faster results using technology A, the same benchmark may return completely different results for different link parameters. Point being: optimizing for a single set of link parameters is infeasible; you'd have to vary the network link conditions and find some kind of metric to determine what "fast" means: an average across all parameters? Or rather weighted averages depending on the 98th percentile of your user base, etc.?

Regarding improving the benchmarks: it is really hard, since (a) Docker images cannot really modify TCP stack settings on the Docker host and (b) client and server would have to flush their TCP congestion control caches at the same time, and only after both have flushed can the next run be conducted.

EDIT: Regarding serving static assets to warm up the connection: in that case, you'd have to include the page-load time needed to download those assets (including the time to parse and execute JS) in your measurement and overall time comparison. Switching the API protocol from REST to something else will probably not have that big of an impact on the total load time then. That is: if you spend 80% of your time downloading index.html, CSS, some JavaScript, etc., and querying your API accounts for only 20% of the time, you can only optimize that 20%. Even if you cut load times for the API calls in half, the total page load time would drop by only 10%.
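
The 80/20 arithmetic above is Amdahl's law. A small sketch, with hypothetical function names, that reproduces the numbers from the paragraph:

```python
def overall_time_saved(fraction_affected, local_speedup):
    """Share of total time saved when a part that takes
    fraction_affected of the total becomes local_speedup times faster."""
    return fraction_affected * (1 - 1 / local_speedup)


def overall_speedup(fraction_affected, local_speedup):
    """Amdahl's law: resulting speedup factor for the whole task."""
    return 1 / (1 - overall_time_saved(fraction_affected, local_speedup))


if __name__ == "__main__":
    # API calls are 20% of page load and become twice as fast.
    print(overall_time_saved(0.2, 2))  # 0.1, i.e. 10% less total time
    print(overall_speedup(0.2, 2))     # about 1.11x overall
```

Even an infinitely fast API (`local_speedup` approaching infinity) could save at most the 20% it accounts for.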