Hacker News
by glaslong 101 days ago
The marginal differences in quality seem pretty meaningful right now, enough to make Claude wildly dominant, but some of the locally runnable models like Qwen feel only a few months behind the leaders.

I'm betting the generational gains level off and smaller local models close the gap somewhat. Then harnesses will generally matter more than the model, and proprietary harnesses will not offer much beyond optimization for specific models. All while SaaS prices ratchet up, pushing folks toward local and OSS. Or at least local vs a plethora of hosted competition, same as cloud vs on-prem.

1 comments

Qwen3.5 really is great.

But the biggest thing is going to be context. While a 10 GB card can run a 9B model with some context, for coding you really want a lot of context.

So if the choice is paying 200 a year for 1T in context versus your 32k of local context, that's the thing I see as being the driver.
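
For a rough sense of why context is the squeeze on a 10 GB card, here's a back-of-envelope sketch. The 4-bit quantization, layer count, KV head count, and head dimension are assumptions picked for illustration, not the specs of any particular model, and it ignores activations and runtime overhead.

```python
def weights_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the model weights at a given quantization."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Memory for the KV cache: one key and one value vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_tokens


GB = 1e9

# Hypothetical 9B model, 4-bit weights (assumed values, not a real spec).
weights = weights_bytes(9e9, bits_per_weight=4)

# Hypothetical architecture: 36 layers, 8 KV heads, head_dim 128, fp16 cache.
kv = kv_cache_bytes(n_tokens=32_768, n_layers=36, n_kv_heads=8, head_dim=128)

total_gb = (weights + kv) / GB
print(f"weights: {weights / GB:.2f} GB, KV cache at 32k: {kv / GB:.2f} GB, "
      f"total: {total_gb:.2f} GB")
```

Under those assumptions the cache at 32k tokens costs about as much as the weights themselves, so doubling the context would already blow past 10 GB.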

Personally, I've had great success using OpenCode, with Opus as my plan agent and omnicoder-9b as my build agent.

Get Opus to plan, switch to omnicoder to build, switch back to Opus to review, and so on.

Works great.
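
Roughly, the loop looks like the sketch below. This is not how any particular tool implements it; `plan_build_review`, the `Model` type, the prompts, and the `APPROVED` convention are all made up for illustration, and the two fake models stand in for the big planner and the small local builder.

```python
from typing import Callable

# Hypothetical interface: a model is anything that maps a prompt to text.
Model = Callable[[str], str]


def plan_build_review(task: str, planner: Model, builder: Model,
                      max_rounds: int = 3) -> str:
    """Planner writes the plan, builder implements, planner reviews."""
    plan = planner(f"Write a step-by-step plan for: {task}")
    result = builder(f"Implement this plan:\n{plan}")
    for _ in range(max_rounds):
        review = planner(
            "Review the result against the plan. Reply APPROVED if it is fine.\n"
            f"Plan:\n{plan}\nResult:\n{result}"
        )
        if review.strip().startswith("APPROVED"):
            break
        # Hand the reviewer's feedback back to the cheaper build model.
        result = builder(
            f"Revise using this feedback:\n{review}\nPrevious result:\n{result}"
        )
    return result


# Stand-in models so the sketch runs without any API or GPU.
def fake_planner(prompt: str) -> str:
    if prompt.startswith("Review"):
        return "APPROVED" if "v2" in prompt else "Needs error handling"
    return "1. do the thing"


def fake_builder(prompt: str) -> str:
    return "code v2" if prompt.startswith("Revise") else "code v1"


final = plan_build_review("add a CLI flag", fake_planner, fake_builder)
print(final)
```

The point of the split is that only the plan and review steps need the expensive model; the build step, which burns the most tokens, runs locally.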

> for coding you really want a lot of context

After having gone all-in on LLM agents for a while, I'm not so sure anymore. An LLM with lots of context can sometimes generate more accurate code, but it can also hide decision-making from you, the person who actually has to maintain that code. If the LLM pulls in 1000 files to make a decision, that's no longer a decision that you can understand.