Hacker News
by snemvalts, 29 days ago
What about other benchmarks? Benchmarks whose contents are freely available have become useless for evaluating models.