HN Mirror


	by toyg 9 days ago
	I've been playing with local models for some time, and I've been pleasantly surprised of late. A modest RTX 5080 with 16 GB can give pretty good results now. The ecosystem is also improving pretty quickly. I have a feeling that at some point we will have a "Windows 95" moment in AI (the point when computing really became personal for the masses), and things will significantly change shape again.


josephg 9 days ago

What local model do you recommend these days? I’ve got a 4090, mostly sitting idle.


kennywinker 8 days ago

The answer to "which AI model", in mid 2026, is always qwen. Depending on your RAM, it’s qwen3.5-9b, qwen3.6-35b-a3 in a 3- or 4-bit quant, or qwen3.6-27b. I’m told a bigger model, quantized, is better than a smaller model unquantized. In 16 GB of VRAM on 10-year-old hardware I can run a 3-bit quant of qwen3.6-35b-a3 at ~30 tokens/sec, and it can do a lot.
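
For a rough sense of why a 3-bit quant of a 35b model can fit in 16 GB, here is a back-of-the-envelope sketch. It assumes the weights take roughly (parameter count × bits per weight ÷ 8) bytes. The `headroom_gb` allowance for context cache and runtime overhead is a made-up placeholder, not a measured figure, and real quant formats usually spend a bit more than their nominal bits per weight, so treat the numbers as lower bounds.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes).

    Billions of parameters times bits per weight, divided by 8 bits per byte.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if the weights plus a guessed headroom fit in the given VRAM.

    headroom_gb is a hypothetical allowance for context cache and overhead.
    """
    return weight_gb(params_billion, bits_per_weight) + headroom_gb <= vram_gb


if __name__ == "__main__":
    # Parameter counts are taken from the model names mentioned in the thread.
    for name, params, bits in [
        ("35b @ 3-bit", 35, 3),
        ("35b @ 4-bit", 35, 4),
        ("27b @ 4-bit", 27, 4),
        ("9b @ 16-bit", 9, 16),
    ]:
        size = weight_gb(params, bits)
        print(f"{name}: ~{size:.1f} GB weights, fits in 16 GB: {fits(params, bits, 16)}")
```

Under these assumptions, the 35b model at 3 bits needs about 13.1 GB for weights, while the 9b model at full 16-bit precision needs about 18 GB, which is why the quantized larger model is the one that fits on a 16 GB card.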


toyg 9 days ago

qwen3.5-9b has been a pretty decent workhorse for me, even with a context of around 4k.
