| HN Mirror



	by BoredomIsFun 123 days ago
	> it consistently ranks extremely low on benchmarks

	As general-purpose chatbots, small Mistral models are better than comparably sized Chinese models, as they have better SimpleQA scores and better general knowledge of Western culture.

1 comments

seanmcdirmid 123 days ago

It’s really hard to beat qwen coder, especially for role play where the instruction following is really useful. I don’t think their corpus is lacking in western knowledge, although I wonder if Chinese users get even better results from it?


BoredomIsFun 123 days ago

> It’s really hard to beat qwen coder, for role play

I am not sure you have actually tried that. Mistral models are widely accepted as the go-to models for roleplay and creative writing. No Qwen models are good at prose, except for the latest big Qwen 3.5.

> I don’t think their corpus is lacking in western knowledge,

It absolutely is, especially in pop culture knowledge.


seanmcdirmid 123 days ago

Instruct and Coder just follow instructions so well, though. I guess I’ve just never been able to make Mistral work well.


BoredomIsFun 123 days ago

Qwen3 30B A3B and that big 400+ B Coder were absolutely terrible at editing fiction. I would tell them what to change in the prose and they'd just regurgitate text with no changes.


seanmcdirmid 122 days ago

Did you try asking Gemini what model to use and how to configure/set it up? It has worked wonders for me, ironically (since I’m using a big model to set up smaller local models).


BoredomIsFun 122 days ago

> Did you try asking Gemini what model to use and how to configure/set it up?

That would be suboptimal, as Gemini's knowledge cutoff is too old. I am long past needing such advice anyway, as I've been using local models since mid-2024.
