I don't know exactly what is going on, but the last time I looked into it, the main difference between the codes seemed to be how quickly each one computes 1/sqrt(x). Ultimately, I would like to see more numerical benchmarks, and also comparisons of several versions within a given language.
We started a repository for it:
https://github.com/fortran-lang/benchmarks/
But we haven't had time to work on it yet. See the issues, e.g., at:
https://github.com/fortran-lang/benchmarks/issues/2
for some discussion of how best to do that.
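
To make the "compare more versions in a given language" idea concrete, here is a minimal sketch in Python of what such a comparison could look like for the 1/sqrt(x) kernel. It is only an illustration and is not taken from the benchmarks repository: the function names (rsqrt_div, rsqrt_pow, best_time, compare), the input size, and the tolerance are all my own assumptions. It times two ways of writing the same operation and checks that they agree before reporting anything.

```python
import math
import timeit


def rsqrt_div(xs):
    """Reciprocal square root via an explicit sqrt followed by a division."""
    return [1.0 / math.sqrt(x) for x in xs]


def rsqrt_pow(xs):
    """Reciprocal square root via a single power operation."""
    return [x ** -0.5 for x in xs]


def best_time(func, xs, repeat=5, number=10):
    """Best-of-`repeat` wall time, in seconds, for one call of func(xs)."""
    timer = timeit.Timer(lambda: func(xs))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def compare(variants, xs):
    """Time each variant and check that it agrees with the first one."""
    reference = variants[0](xs)
    results = {}
    for func in variants:
        out = func(xs)
        # Hypothetical tolerance: the variants may differ in the last bits.
        assert all(
            math.isclose(a, b, rel_tol=1e-12) for a, b in zip(out, reference)
        ), f"{func.__name__} disagrees with {variants[0].__name__}"
        results[func.__name__] = best_time(func, xs)
    return results


if __name__ == "__main__":
    # Strictly positive inputs so that sqrt and the negative power are defined.
    xs = [0.5 + i for i in range(100_000)]
    for name, seconds in compare([rsqrt_div, rsqrt_pow], xs).items():
        print(f"{name}: {seconds * 1e3:.3f} ms per call")
```

The timings depend heavily on the machine and interpreter, so I would only read them as relative numbers. The same structure (several variants of one kernel, a correctness check, best-of-N timing) should carry over to compiled languages, where compiler flags would be one more thing to vary.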