Hacker News
by jasondigitized 72 days ago
What would be the incentive to engage in this tactic when the proof is ultimately in the pudding once the model hits the streets? Who would ultimately benefit from fudging these numbers?

Anthropic would def benefit, as benchmarks are almost always quite useless compared with real-life use.
How specifically would they benefit? People flock to them based on the hype, and then the model sucks and they leave?