Hacker News
by w_t_payne 3281 days ago
I wouldn't measure just one thing. I'd measure a whole heap of different metrics, apply some sort of (possibly nonlinear) mapping to each so that bigger numbers (intuitively) indicate better performance, then convert each to deciles. I'd score each developer by their worst metric, i.e. by how high their lowest decile is. That rewards consistency across all metrics and ignores high performance in a small number of areas, which makes the score harder to game. It might also be worth training a model to predict that score from other, unstructured data, again to make it harder to game the system explicitly.
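To make the decile-then-worst-metric idea concrete, here is a rough Python sketch. It is only one possible reading of the scheme: the developer names, the metric names (`reviews`, `fixes`) and the rank-based decile rule are all made up for illustration, and every metric is assumed to be already mapped so that bigger means better.

```python
from bisect import bisect_left


def decile_ranks(values):
    """Map each value to a decile from 1 to 10 within its own metric.

    Assumption: a value's decile is based on the share of values strictly
    below it, so ties share a decile and 10 is the best.
    """
    n = len(values)
    ordered = sorted(values)
    # bisect_left on the sorted list counts the values strictly below v.
    return [1 + (10 * bisect_left(ordered, v)) // n for v in values]


def worst_metric_scores(metrics):
    """Score each developer by their lowest decile across all metrics.

    `metrics` maps developer -> {metric name -> value}, with every metric
    already oriented so that a bigger number is better.
    """
    developers = list(metrics)
    metric_names = list(next(iter(metrics.values())))
    deciles = {dev: [] for dev in developers}
    for name in metric_names:
        column = [metrics[dev][name] for dev in developers]
        for dev, decile in zip(developers, decile_ranks(column)):
            deciles[dev].append(decile)
    # The score is the minimum, so one weak area drags the whole score down.
    return {dev: min(ds) for dev, ds in deciles.items()}


# Hypothetical data: "e" tops one metric but is last on the other.
example = {
    "a": {"reviews": 1, "fixes": 5},
    "b": {"reviews": 2, "fixes": 4},
    "c": {"reviews": 3, "fixes": 3},
    "d": {"reviews": 4, "fixes": 2},
    "e": {"reviews": 5, "fixes": 1},
}
scores = worst_metric_scores(example)
```

With this toy data the consistent middle performer "c" gets the highest score (5), while "a" and "e", who each top one metric and bottom the other, both score 1. Taking the minimum is what makes a single spectacular metric worthless on its own.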