| HN Mirror


	by cess11 387 days ago
	I don't know what counts as a major model. Relevant to this, I've dabbled with Gemma, Qwen, Mistral, Llama, Granite and Phi models, mostly 3-14b varieties but also some larger ones on CPU on a machine that has 64 GB RAM.

1 comments

wild_egg 387 days ago

I think the issue there is those smaller versions of those models. I regularly use Gemma3 and Qwen3 for programming without issue but in the 27b-32b range. Going smaller than that generally yields garbage.

cess11 386 days ago

I've tried 24-32b sizes as well, and besides being even slower, they were also unreliable.
