Hacker News new | ask | show | jobs
by RussianCow 9 days ago
I've been doing the same, though admittedly out of curiosity more so than lack of funds. The open models are catching up quickly in their abilities, to the point where they're (mostly) not doing stupid stuff regularly, but you have to be very specific about what you want. I found that Opus, for example, is much better at asking me to clear up ambiguity in a request before starting, whereas the Chinese models tend to "fill in the blanks" and make their own assumptions.

My current workflow involves going from PRD -> execution plan -> build -> review, and this works nicely with open weight models like GLM 5.1, Kimi K2.6, and DeepSeek V4 Flash. With Opus I can generally skip the PRD entirely, and sometimes even skip the plan, and 80-90% of the time it does exactly what I want. But that can easily burn $5-15 for one feature, whereas it'll cost maybe $1-2 with the open weight models (at API pricing).

2 comments

> ... you have to be very specific about what you want. I found that Opus, for example, is much better at asking me to clear up ambiguity in a request before starting, whereas the Chinese models tend to "fill in the blanks" and make their own assumptions.

That's the main thing I've noticed. Small models can follow instructions just fine. If the instructions are very specific. Then I often have to spend more time explaining a task than it would have taken me to do it myself.

The bigger models have a lot more common sense.

I wonder if that could be improved slightly through prompting. Asking it to clarify anything that's confusing. Or maybe it just makes incorrect assumptions without realizing the ambiguity. One way to find out!

This is my observation as well with deepseek by flags. It takes too much initiative, and is often not particularly smart. Yet, I find it is so fast and good at iterating/correcting it's mistakes that it eventually finds the way on its own.

Though, I tend to use it as a pair programmer so just stop it and provide guidance.

The real problem is that it is excessively verbose - it's impossible to keep up with it's train of thought, and not practical to read it all. So I tend it just let it do it's thing then skim a bit and skip to the end for it's summary.

Try opencode go subscription - you get the Chinese models for 6x discount. I use like $1 a day...