| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jasonjmcghee 283 days ago

I think the best models around right now that most people can fit some quantization on their computer if it's a apple silicon Mac or gaming PC would be:

For non-coding: Qwen3-30B-A3B-Instruct-2507 (or the thinking variant, depending on use case)

For coding: Qwen3-Coder-30B-A3B-Instruct

---

If you have a bit more vram, GLM-4.5-Air or the full GLM-4.5

1 comments

all2 283 days ago

Note that Qwen3 and Deepseek are hobbled in Ollama; they cannot use tools as the tool portion of the system prompt is missing.

Recommendation: use something else to run the model. Ollama is convenient, but insufficient for tool use for these models.

link

theshrike79 283 days ago

Could you give a recommendation that works instead of saying what doesn't work?

link

simonw 283 days ago

Try LM Studio or llama-server: https://simonwillison.net/2025/Aug/19/gpt-oss-with-llama-cpp...

link

all2 282 days ago

I would, but I haven't found a working solution.

link