Hacker News new | ask | show | jobs
by jzymbaluk 245 days ago
How ironic that these LLM's appear to be overfitting to the benchmark scores. Presumably these researchers deal with overfitting every day, but can't recognize it right in front of them
1 comments

I'm sure they all know it's happening. But the incentives are all misaligned. They get promotions and raises for pushing the frontier which means showing SOTA performance on benchmarks.