| HN Mirror



	by estsauver 456 days ago
	Is that Qwen 2.5 or Qwen 3? I don't see Qwen 3 on the aider leaderboard yet: https://aider.chat/docs/leaderboards/

2 comments

aitchnyu 456 days ago

As a human who asks AI to edit up to 50 SLOC at a time, is there value in models that score less than 50%? I'm using `gemini-2.0-flash-001`, though.


manmal 456 days ago

The aider score mentioned in GP was published by Alibaba themselves, and is not yet on aider's leaderboard. The aider team will probably do their own tests and maybe come up with a different score.
