by espadrine 314 days ago
It would be interesting to have two generations per model, without cherry-picking, so that the Elo estimate can come with an easy-to-compute standard deviation.
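
To make the idea concrete, here is a minimal sketch, assuming each generation is rated on its own so that every model ends up with two Elo values. The function name `elo_with_spread` and the numbers are hypothetical and only there for illustration; they are not from any actual leaderboard.

```python
from statistics import mean, stdev


def elo_with_spread(ratings_by_model):
    """Map each model to (mean Elo, sample standard deviation).

    ratings_by_model: dict of model name -> list of Elo ratings,
    one rating per independent (not cherry-picked) generation.
    """
    summary = {}
    for model, ratings in ratings_by_model.items():
        if len(ratings) < 2:
            # A sample standard deviation needs at least two values.
            raise ValueError(f"{model}: need at least two generations")
        summary[model] = (mean(ratings), stdev(ratings))
    return summary


# Hypothetical ratings, two generations per model.
example = {
    "model-a": [1510.0, 1490.0],
    "model-b": [1400.0, 1440.0],
}

for model, (avg, spread) in elo_with_spread(example).items():
    print(f"{model}: {avg:.0f} +/- {spread:.1f}")
```

With exactly two generations the sample standard deviation reduces to |a - b| / sqrt(2), so it is a rough estimate, but it is enough to see whether a gap between two models is larger than the run-to-run noise.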