| The author also points out that some of the benchmarks poorly represent real workloads: "Bottom up (since the worst offenders are now first), binary-trees is silly since it measures allocation speed for a case that simply doesn't exist in real code; thread-ring is basically insane, since nobody ever bottlenecks like that; chameneos-redux's C++ implementation is ridiculous. The C is not so ridiculous, but you still have the problem that basically every language in the top few spots does something completely different; pidigits tests whether you have bindings to GMP; regex-dna tests a regex engine on a small subset of cases (arguably the first half-acceptable benchmark); k-nucleotide tests who has the best hash table for this particular silly scheme, and they don't all even do the same thing (eg. Scala precompacts, like my new Rust version); mandelbrot is kind'a OK; reverse-complement would be kind'a OK if not for a few hacky implementations (like the Rust); spectral-norm is kind'a OK; Haskell basically cheats fasta (which is why I copied it); meteor-contest is too short to mean anything at all; fannkuch-redux is probably kind'a OK, n-body is kind'a OK. So maybe 5/13 are acceptable, and I'd still only use 4 of those. I think if looking at mandelbrot, spectral-norm, fannkuch-redux and n-body you can argue the benches are a reasonable measure of peak performance. However, these cases are also all too small and simple to really be convincing either, nor is it particularly fair (where's NumPy for Python?)." https://users.rust-lang.org/t/blog-rust-faster/3117/12?u=acc... |
Have you looked at the benchmarks game website?
Please show where the benchmarks game website claims that those tasks simulate "real workloads" (whatever that means).
You will see "Your application is the ultimate benchmark" and "These are just 10 tiny examples" and …
http://benchmarksgame.alioth.debian.org/dont-jump-to-conclus...