| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tel 5800 days ago

I think there's less disagreement about the well-controlled result derived and more about whether

   a) Your controls were right, and
   b) What the most optimal decision is in light of this new information.

(a) is well known to be one of the most difficult parts of scientific reasoning and is almost always open to endless debate and improvement. In short, it's the question of whether ATR is a human-sensible metric. (b) however has an interesting direct answer: figure out the distribution of "live" ATRs on an interesting population of real servers and then, to borrow Eliezer's phrase, shut up and multiply.

If a lot of servers that you're targeting with M2 fall across that 60% divide (under circumstances similar to your controlled microbenchmark) then of course Superpoll is a good compromise.

Jacques is arguing a combination of (a) and (b). Perhaps ATR is not a sufficient metric to understand all interesting server loads. Moreover, perhaps many interesting servers live at really low or high ATRs all the time and so Superpoll must gracefully degrade to either poll or epoll.

In any case, driving for empirical data is noble, but possessing data is never sufficient to whitewall all detractors. It's really nice to have strong empirical support for the breakeven point between the two (ie. the ratio of their constant time components) via your benchmark, but science isn't just statistics.

(edit: I'll also add that pushing the pipetest microbenchmark past where people are usually making hyperbolic claims is a pretty big deal and a good catch.)