| HN Mirror



	by degrews 357 days ago
	It's because those markets are based on the LMArena leaderboard (https://lmarena.ai/), where Claude has historically done poorly. That eval has also become a lot less relevant (it's not considered very indicative of real-world performance), so it's unlikely Anthropic will prioritize optimizing for it in future models.

2 comments

kmacdough 357 days ago

Anthropic has always been one of the best at not optimizing for stupid metrics. Rather, they spend significant energy researching weaknesses and building metrics around those. Google is also pretty on point IMO, but they can also afford to dedicate resources to these nonsense metrics, as they are still good marketing.

Meanwhile, Meta and xAI are behind the curve and largely marketing-focused.


ttroyr 345 days ago

True. I'm surprised they are not based on e.g. OpenRouter usage or similar.
