by theturtletalks
479 days ago
I personally use Aider's Polyglot Benchmark [0], which is fairly low-key and hasn't been gamed yet. It also matches my experience: Claude Sonnet 3.5 is the best and still beats the new reasoning models like o3-mini, DeepSeek, etc.

[0] https://aider.chat/docs/leaderboards/