| I think you might not be familiar with the package used to benchmark Julia code [1]. It does not pin processes to CPUs or set the kernel's CPU governor to performance, so there are fluctuations from other activity on the computer. But it does run the function for several seconds and returns the distribution of the run times (the little graphs underneath the benchmarks). It calculates the standard deviation, and if some run times are too short (sub-nanosecond) it emits warnings saying the results might be caused by inlining and constant propagation. The differences in runtimes you refer to come from the use of different machines or different routines, which is completely expected. They also argue that they need to run the Mojo code on the same machine as the Julia code to be able to give meaningful results and comparisons. While to an outsider it might look like it was done without care, I can assure you that these people are used to taking extreme care in how they do benchmarks. Again, it might just be that you're not familiar with the tooling developed to do it. I do think more benchmarks need to be done, as the Mojo code hasn't been optimised yet and no one in that thread was able to run both the Julia code and the Mojo code on the same machine (apart from the OP). But I'm sure this will be done (sooner rather than later, I guess). :) [1] Documentation of the package used for benchmarking: https://juliaci.github.io/BenchmarkTools.jl/stable/
There you can find all the points you raised in your comment, and more, about the reproducibility of benchmarks in different environments.
White paper on the strategies used by the package: https://arxiv.org/abs/1608.04295 |
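
To make the "run it for a few seconds and report the distribution" idea concrete, here is a rough Python sketch. It is not how BenchmarkTools.jl is implemented, and the names `sample_runtimes` and `summarize` are made up for illustration. It only shows the general approach: call the function repeatedly within a time budget, keep every timing, and report summary statistics instead of a single number.

```python
import statistics
import time


def sample_runtimes(func, budget_seconds=0.2, max_samples=10_000):
    """Call func repeatedly until the time budget or the sample cap is reached.

    Returns the per-call wall-clock times in seconds. At least one sample
    is always collected, even if a single call exceeds the budget.
    """
    samples = []
    deadline = time.perf_counter() + budget_seconds
    while len(samples) < max_samples:
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
        if time.perf_counter() >= deadline:
            break
    return samples


def summarize(samples):
    """Summarize a list of timings: count, minimum, median, mean, std. dev."""
    return {
        "n": len(samples),
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        # The sample standard deviation needs at least two data points.
        "std": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical workload, chosen only so there is something to time.
    print(summarize(sample_runtimes(lambda: sum(range(1000)))))
```

Looking at the minimum and median next to the spread is what lets you tell a real difference from noise caused by whatever else the machine was doing, which is why numbers from different machines are not directly comparable.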