Hacker News
by armcat 25 days ago
This person did a great comparison against Qwen models: despite having 8x fewer active params, the Qwen models outperform the Cohere model in every category:
https://x.com/DJLougen/status/2057196012918149368?s=20