Hacker News
by veselin 815 days ago
I think this is simply the default of lm-evaluation-harness. They said they ran every single benchmark they could out of the box.