It would be interesting to hear about it from MS. Do you know of other settings, configurations, or features that could greatly influence the results of such a comparison?
Since we started replying to these points on the GitHub thread at https://github.com/dotnet/spark/issues/45, I suggest we continue the discussion there. As mentioned there, we want to be transparent about the benchmark code and systems we use. We are currently working on Arrow support so that the comparison is fair.