by HDBaseT 26 days ago
Deepseek V4 Pro is an amazing model, even without the unreal cost factored in.

It is my default model at the moment. I'm not doing anything too complex though. I've honestly found that more expensive models like Qwen 3.6 fail at tasks Deepseek nails.

I'm interested in knowing what people are using for tasks which require a bit more thinking. Kimi 2.6? Qwen 3.7? GLM 5.1?

2 comments

I don't think there are any open models at the moment that can handle the more challenging stuff.

What I use Opus for at work is finding bugs in roughly 200k lines of microservices and libraries in a niche language. We get bug reports that are missing context, can't easily be reproduced on our dev server, and are usually the result of something deep in multiple services/libraries combining with very custom configs. I can ask Opus (max thinking) to find what could cause the bug, and it usually nails it in a few hours (it would take me 1-2 weeks to trace it myself). The end result is usually fewer than 10 lines of code to fix it, some tests to reproduce the bug, and a nice report explaining it, so it can be checked in an hour or two.

17 Go microservices for a serious project were written perfectly using the latest version of Qwen (3.6). The only areas where we really had to work hard were documentation and a very detailed task breakdown. All of this was tested, and yes, a review was required, but everything was within reason. The deadline was 10 days of 24/7 work, including the review. When we gave the same task to Opus 4.7/4.6, it had to be stopped after three hours. If you have significant resources for experimentation, you can certainly try. For us, the choice is absolutely clear at this point.