by blueben 5797 days ago
Because "real world traffic" is a bullshit test. Whose real world traffic? Mine? Yours? Google's? Yahoo's? Science is repeatable, which means you can replicate the same inputs to the test every time. The lab experiments involved in science are often highly sanitized and not "real world" at all. Determining what the results mean in the real world comes later.

If you're going to complain about science, at least understand how it works.

3 comments

There are basically three sorts of traffic that I have experience with and would expect to be the major portion of whatever goes over the web:

- underwater ajax requests

- regular website content (images, dynamic html, css and other relatively small (say < 250K) files)

- media servers (filedumps, video servers, streaming audio servers)

Each of those requires fairly specific tuning of the TCP stack to get the most out of it, so you're not likely going to find all of these on one and the same machine unless it is a small operation (and in that case this whole discussion is moot).

A benchmark done in isolation is meaningless because in the end, real world traffic is what it is all about. So, I personally don't care whose site(s) you test with, as long as there are enough of them to get a statistically valid result.

Google's or Yahoo's would be fine with me. I've given my results above; if I have the time I'll do the same thing on a couple of other high volume sites.

I've (unfortunately) studied this problem quite a bit because of the size of the websites that I'm involved with, and so far I've learned that you can play around on your testbench all day long, but it doesn't matter one little bit for production purposes unless you are very careful (such as in that other test linked from this page) to simulate users' clients.

You could do a lot worse than to play back a log file in order to make an experiment repeatable. I assume that real world performance is what Zed is after, not theoretical performance.
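To make the log playback idea concrete, here is a minimal sketch that turns access log lines into a repeatable replay schedule. It assumes the log is in Common Log Format; the names `LOG_LINE`, `build_schedule` and `SAMPLE_LOG` and the sample entries are made up for illustration, and actually sending the requests is left out.

```python
import re
from datetime import datetime

# Assumption: Common Log Format. Adjust the pattern for your server's layout.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" \d{3} \S+'
)


def build_schedule(lines):
    """Return (offset_seconds, method, path) tuples relative to the first request."""
    parsed = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        when = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        parsed.append((when, match.group("method"), match.group("path")))
    parsed.sort(key=lambda item: item[0])
    if not parsed:
        return []
    start = parsed[0][0]
    return [
        ((when - start).total_seconds(), method, path)
        for when, method, path in parsed
    ]


# Hypothetical sample entries, not taken from any real site.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2009:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2009:13:55:37 +0000] "GET /img/logo.png HTTP/1.1" 200 10240',
    '10.0.0.1 - - [10/Oct/2009:13:55:39 +0000] "POST /ajax/poll HTTP/1.1" 200 87',
]

schedule = build_schedule(SAMPLE_LOG)
for offset, method, path in schedule:
    print(f"+{offset:.0f}s {method} {path}")
```

A replay client would sleep until each offset and then issue the request, so every run feeds the server the same inputs in the same order and with the same spacing.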

You know what I find troubling about your behavior on this thread? It's just so weirdly manipulative. I gotta think you have like 40% stock invested in Epoll, Inc. or something. You make wild claims about what I'm saying that aren't true, you imply that I know nothing of real world performance when I've written some pretty bad ass real world software. You reply to every single thread with a constant stream of FUD and nitpicking everything you can then blowing it out of proportion.

I mean, are you sure you didn't used to work for Microsoft and then got hired by Linus to work your FUD spreading magic?

  Because "real world traffic" is a bullshit test [..]
The hell it is. A statistically significant sample of 'real world ...' is the foundation for most engineering decisions. When you build a bridge, you take the actual loads it has to support into account. Intel bases their chipbaking on the actual purity of the silicon their suppliers can provide.
Exactly, there's no concept of confounding at all. You use "real world tests" (whatever the hell that is) when you have an actual specific setup to test. You use a small model experiment like this to test one specific thing like poll vs. epoll.
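For illustration only, a small model experiment of that kind might look like the sketch below: many idle sockets, one ready socket, and the same number of wait calls through `select.poll` and `select.epoll`. It assumes Linux (`select.epoll` does not exist elsewhere), the helper name `time_poller` and the parameter values are arbitrary, and in Python the interpreter overhead will blur the difference, so treat the timings as a rough indication at best.

```python
import select
import socket
import time


def time_poller(make_poller, n_idle=200, rounds=500):
    """Time `rounds` non-blocking waits over n_idle idle sockets plus one ready one."""
    pairs = [socket.socketpair() for _ in range(n_idle + 1)]
    poller = make_poller()
    try:
        for reader, _writer in pairs:
            poller.register(reader.fileno(), select.POLLIN)
        pairs[-1][1].send(b"x")  # exactly one socket becomes readable
        ready_fd = pairs[-1][0].fileno()

        events = []
        start = time.perf_counter()
        for _ in range(rounds):
            events = poller.poll(0)  # timeout 0: return immediately
        elapsed = time.perf_counter() - start

        found_only_ready = [fd for fd, _mask in events] == [ready_fd]
        return elapsed, found_only_ready
    finally:
        if hasattr(poller, "close"):  # epoll objects have close(), poll objects do not
            poller.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


results = {"poll": time_poller(select.poll)}
if hasattr(select, "epoll"):  # Linux only
    results["epoll"] = time_poller(select.epoll)

for name, (elapsed, ok) in results.items():
    print(f"{name}: {elapsed:.4f}s for 500 waits, correct={ok}")
```

Varying `n_idle` while holding everything else fixed is what isolates the one variable under test; what the numbers mean for a production site is a separate question.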