| HN Mirror



	by jdw64 3 days ago
	Personally, when I use OpenCode or routers, I feel that beyond a certain level, the choice of model doesn't make a huge difference to me, except for expensive and mediocre models like Gemini. In that sense, Chinese models are pretty good. I usually write code in function- or method-sized units and then design and assemble them together. GPT-series models are more thorough and better, but I'm not sure the difference is enormous. It seems to depend on the workflow, but if you are thorough enough, I wonder whether there really is a big difference.

4 comments

sjanes 3 days ago

I've kind of given up on the routers for "free" inference, as you would expect, they tend to give you sub-par thinking because they are obviously trying to conserve as much inference as possible.

I've had some success turning my M1 Pro MacBook into a heating pad with Qwen 3.6 35B A3B MTP. Trying to use Gemini models "locally" resulted in a similar "short shrift" of effort, leading to mistakes and lots of turns. The reports of Fable being relentlessly "proactive" show you can go the other direction as well, if you have strong enough branding and effective invoicing.

ignoramous 3 days ago

> I've kind of given up on the routers for "free" inference, as you would expect, they tend to give you sub-par thinking because they are obviously trying to conserve as much inference as possible.

Xiaomi MiMo ($6/mo: https://platform.xiaomimimo.com/token-plan) & Alibaba Qwen ($50/mo: https://www.alibabacloud.com/en/campaign/ai-scene-coding) have generous limits on fixed subscriptions.

MaKey 3 days ago

So does Opencode Go ($10/mo: https://opencode.ai/go) for DeepSeek v4 Flash and MiMo 2.5.

apitman 3 days ago

That looks pretty nice. How does it compare cost-wise to just using OpenRouter?

arcanemachiner 3 days ago

The Go plan essentially gives you $50 of inference for $10 per month ($5 for the first month).

ignoramous 3 days ago

$60/mo currently: https://opencode.ai/docs/go/#usage-limits

Their limits are staggered: 5h (max $12), weekly ($30), monthly ($60).
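
To make the staggered limits concrete, here is a minimal Python sketch that uses only the figures quoted in this thread ($10/mo price, caps of $12 per 5h, $30 per week, $60 per month). The function names and the shape of the `spent` dictionary are my own assumptions for illustration, not anything from the provider's API.

```python
# Figures quoted in the thread; treat them as a snapshot, not current pricing.
PRICE_PER_MONTH = 10.0
LIMITS = {"5h": 12.0, "weekly": 30.0, "monthly": 60.0}


def remaining_budget(spent: dict[str, float]) -> float:
    """Return the dollars of inference still usable right now.

    `spent` maps a window name to the dollars already used in that window
    (hypothetical bookkeeping; missing windows count as zero).
    Every cap must hold at once, so the tightest window decides.
    """
    return max(0.0, min(LIMITS[w] - spent.get(w, 0.0) for w in LIMITS))


def value_multiplier() -> float:
    """Monthly inference cap divided by the subscription price."""
    return LIMITS["monthly"] / PRICE_PER_MONTH
```

With $2 used in the current 5h window, $25 this week, and $40 this month, the weekly cap is the binding one and only $5 is left, even though the 5h window still has $10 of headroom.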

WalterGR 3 days ago

> The reports of Fable being relentlessly "proactive"

For the curious: https://news.ycombinator.com/item?id=48498573 - “Claude Fable is relentlessly proactive”.

mft_ 3 days ago

Tangent: did the MTP help you at all? I’ve tested that model back to back on my M1 Max MBP and the MTP version was actually marginally worse. I wonder if I didn’t use the right settings, although I tried several based on the obvious sources.

onlyrealcuzzo 3 days ago

In my experience, there's little difference between frontier models and SotA ~30B param models when it comes to implementing individual functions.

Once you have a coherent design (the hard part), you can feed it to a pretty small model and get basically the same quality.

They won't one-shot it, but they're faster and cheaper, so it still works out in your favor.

Plus you can do it locally...

jdw64 3 days ago

I have a similar experience. However, when code review is included, I think the GPT model is the most impressive.

regularfry 3 days ago

The difference in outcome isn't that big but yes, you need to be more rigorous. For instance I've found that the Kimi K2.5 and K2.6 models will comment out failing tests rather than fix a problem they just caused (mistaking them for "pre-existing failures"), so you need to specifically make commented-out tests break the build. I've not personally had that problem with any of the Anthropic or OpenAI models.

torginus 3 days ago

I wonder why it's the natural tendency of models to BS or do stuff like this when they don't have the correct answer - it's clear that they can program refusal into them, but for some reason, refusal has to be injected after the fact, and models can't really arrive at the conclusion that they can't answer properly.

Eridrus 3 days ago

I assume it's a lack of care when RLing them.

RL has a tendency to reinforce cheating when the cheats are easier to find than the final solution.

So when making your RL environment, you need to spend a lot of effort on finding ways the model can cheat and penalizing them.
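
As a toy illustration of penalizing one specific cheat (removing or disabling tests), here is a minimal Python sketch of a shaped reward. The `EpisodeResult` fields, the `shaped_reward` name, and the penalty weight are all assumptions made up for this example; real RL environments are far more involved.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    tests_before: int   # tests present before the model's edit
    tests_after: int    # tests still present and enabled after the edit
    tests_passing: int  # tests that pass after the edit


def shaped_reward(r: EpisodeResult, cheat_penalty: float = 2.0) -> float:
    """Pass rate against the original suite, minus a penalty per removed test."""
    if r.tests_before == 0:
        return 0.0
    # Score against the ORIGINAL test count, so deleting a failing test
    # cannot raise the pass rate.
    pass_rate = min(r.tests_passing, r.tests_before) / r.tests_before
    removed = max(0, r.tests_before - r.tests_after)
    return pass_rate - cheat_penalty * removed / r.tests_before
```

With these numbers, an episode that leaves one of ten tests failing scores higher than one that comments the failing test out, which is the ordering you want the policy to learn.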

lotharcable 2 days ago

Probably because there are a ton of open source projects with disabled tests in their training data.

dcreater 3 days ago

I really hope we stop using the term "Chinese models". It carries a negative connotation. It's the equivalent of calling cars Japanese, which people used to do but which is now almost entirely meaningless. You just call them Toyota, Honda, Lexus etc.

esafak 3 days ago

I don't think "Chinese" is pejorative in this context any more than "American" is. They are one of the two ecosystems. What's wrong with saying "Japanese cars" today?

kennywinker 3 days ago

> What's wrong with saying "Japanese cars" today?

Only that it’s a fairly meaningless grouping. When Japan first entered the car market in North America there might have been some commonality, but now what characteristics do they share that some American cars don’t have? They’re not even imported a lot of the time.

Given that, it does start to feel tinged with racism if someone insists on grouping things together that don’t really belong together.

As for Chinese LLMs, the term doesn’t “feel” pejorative to me - but I also don’t see a totally clear set of attributes they share. Not all are open-weight. Some are small and can be run on consumer hardware; some are huge. They even have a variety of answers to what happened on June 3rd, 1989.

Brendinooo 3 days ago

> now what characteristics do they share that some american cars don’t have?

Typically the answer is "reliability", which is a positive trait, which makes the original callout about negative connotations very odd to me.

overfeed 3 days ago

Chinese AI models also share a positive trait: they offer more bang for the buck.

kube-system 3 days ago

> When japan first entered the car market in north america there might have been some commonality, but now what characteristics do they share that some american cars don’t have?

They're unique in that they even make a regular passenger car. American manufacturers only make SUVs and a couple of sports/luxury cars. They basically gave up because the Camry/Corolla/Accord/Civic ate their lunch.

The cheapest sedan you can get from an American brand is the Cadillac CT4.

cheema33 1 day ago

> but now what characteristics do they share that some american cars don’t have?

The difference is quite big in my opinion. When given the option to pick a Japanese vs American vehicle for about the same price/features, most people will pick the Japanese vehicle. American vehicles have improved over the years, but quality and reliability are generally better for Japanese vehicles even today.

antonvs 3 days ago

> but now what characteristics do they share that some american cars don’t have?

Better overall design?

dcreater 3 days ago

Sadly, there is a pejorative context. The constant "us, the free world, vs. China, the evil Soviets" rhetoric from every major news establishment and executive creates that negative view.

fuck_google 3 days ago

On the other hand, the Trump administration has successfully managed to make Chinese seem better than American, so there might not be that much of a pejorative context any more.

antonvs 3 days ago

You're right, but the bias in the US certainly persists. "China = bad" is an assumption that many people still make without any self-reflection about the ways in which the US is now at least as bad.

hootz 3 days ago

For me, it has a positive connotation! In my experience, "Chinese model" means a cheaper but still quite effective model you can use for millions of tokens without burning through your entire wallet in seconds. That's why I get more excited about a Chinese model release than about an American one.

odiroot 3 days ago

"Japanese" is actually a positive qualifier for cars. I'd say the same for anything Japanese and motor-powered.

ffsm8 3 days ago

Maybe he's just from an alternative universe. "Chinese model" isn't negative either, after all.

unethical_ban 3 days ago

No thanks.

The term seems to have the connotation of "competitive at 1/10 the price of Claude", so I don't see the problem.

It's not Harbor Freight Chinese (and heck even they have decent stuff sometimes now too).

You don't think people still talk about Japanese cars as a distinction in quality from US or European ones?

sroerick 3 days ago

I don't know. I tried using one of the Chinese models and it was VERY quick to scan my entire home dir, so maybe your threat surface is a little different from mine.

fooker 3 days ago

Models can't scan anything.

They return instructions for you to do something, and you or a script you permit chooses to execute what the model tells you and return the result to the model.
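
A minimal Python sketch of that split between what the model asks for and what actually runs. The request format, the `ALLOWED_TOOLS` registry, and `handle_tool_request` are hypothetical names invented for this example and do not correspond to any specific agent framework.

```python
from typing import Callable

# Hypothetical tool registry: the model can only *name* a tool;
# the harness (your script) decides whether anything runs.
ALLOWED_TOOLS: dict[str, Callable[[str], str]] = {
    "echo": lambda arg: arg,
}


def handle_tool_request(request: dict[str, str]) -> str:
    """Execute a model-issued request only if the tool is on the allowlist.

    `request` stands in for the structured output a model returns,
    e.g. {"tool": "echo", "arg": "hi"}. The returned string is what
    would be sent back to the model as the tool result.
    """
    name = request.get("tool", "")
    arg = request.get("arg", "")
    tool = ALLOWED_TOOLS.get(name)
    if tool is None:
        return f"denied: tool '{name}' is not permitted"
    return tool(arg)
```

In this sketch a request to list the home directory is simply refused because no such tool is registered. Whether a home-dir scan happens is therefore a property of what the harness permits, not of the model alone.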

jdw64 3 days ago

You are right. I agree. It may seem like a kind of bias, but I hadn't thought of that part. Thank you for pointing out my bias.

theanonymousone 3 days ago

"You're absolutely right"?

jdw64 3 days ago

"You hit the nail on the head" LOL