Their stated reason is: "The different to other benchmark implementations is, that the given task is not known before the data was available" (quoted from https://github.com/nextapps-de/mikado/blob/master/bench/READ...).
I'm not sure what they mean by that.