Hacker News
by lancebeet, 118 days ago
If benchmarks are fishy, it seems their bias would be toward better-than-expected scores for proprietary models, since the vendors of those models have more incentive to game the benchmarks.