by starship731 456 days ago
DeepSeek V3-0324 (new checkpoint) beats ALL but 1 proprietary AND non-thinking LLMs by a significant margin. Check livebench.ai and the Artificial Analysis benchmark for details.

The only non-thinking LLM the new V3 doesn't decisively thrash is GPT-4.5, which costs more than 100 times as much as V3 and yet scores only a few (essentially negligible) percentage points higher.

1 comment

They said "all proprietary" models, not "all but 1 proprietary, non-thinking" models. It doesn't beat all the models!

It's pretty good, especially nice since it's open source, but it's not going to be a daily driver for most people.