by pron
42 days ago
I think you're referring to the less important point I made. Correcting for apples-to-apples is harder and less valuable. Having more domain coverage is easier and more valuable (especially since the current coverage is so narrow and largely irrelevant to most software).

BTW, what we do is compare our suite of microbenchmarks to our (much smaller) suite of macrobenchmarks. This way we get at least some sense of how relevant the microbenchmarks are (i.e. we're looking at the correlation of the deltas). Some microbenchmarks are more correlated with the macrobenchmarks than others. If an optimisation helps some microbenchmarks that we think are not representative of many programs, and it doesn't help any macrobenchmark, we take it out.

Just to give an example, we may want to measure some optimisation that helps some allocation pattern. Sometimes it turns out that if that pattern is diluted by other allocation patterns the program uses for other tasks, the advantage is completely erased. Some optimisations in free-list allocators are particularly susceptible to this: if your program allocates only in this specific way, it will be super fast. If, in addition, there are some sporadic allocations that follow a different pattern, then after an hour you'll see performance start to drop.
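To make the "correlation of the deltas" idea concrete, here is a minimal sketch of one way it could be computed. It is not the actual tooling described above: the benchmark names, the numbers, and the choice of Pearson correlation are all assumptions made for illustration. Each list holds the relative change a benchmark showed across the same sequence of candidate optimisations.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of deltas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a benchmark that never moves carries no signal
    return cov / sqrt(var_x * var_y)


def relevance(micro_deltas, macro_deltas):
    """Score each microbenchmark by how well its deltas track the macro deltas.

    micro_deltas: {benchmark name: [delta per candidate change]}
    macro_deltas: [aggregate macrobenchmark delta per candidate change]
    """
    return {name: pearson(deltas, macro_deltas)
            for name, deltas in micro_deltas.items()}


# Hypothetical data: relative speedups over four candidate optimisations.
micro = {
    "alloc_small": [0.10, 0.02, -0.05, 0.08],
    "string_concat": [0.01, 0.09, 0.03, -0.02],
}
macro = [0.04, 0.01, -0.02, 0.03]

scores = relevance(micro, macro)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

A microbenchmark with a low or negative score is one whose wins rarely show up in the macrobenchmarks, so an optimisation that helps only that benchmark would be a candidate for removal under the policy described above. With only a handful of changes the correlation is noisy, so a score like this is a hint at most.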
|
Hopefully, some of the target audience might try to confirm that programs are what they think of as "comparable".
> Having more domain coverage is easier and more valuable…
So where are the examples of that being done? (It's been decades.)