| HN Mirror



	by kennywinker 13 days ago
	You can create anything you need, as long as what you need is a disposable script, a scotch-taped-together single-page app, or a solution to a complex problem that you have thousands of dollars in tokens to throw at.

toyg 13 days ago

I've been playing with local models for some time, and I've been pleasantly surprised of late. A meager RTX 5080 with 16 GB can give pretty good results now. The ecosystem is also improving pretty quickly.

I have a feeling at some point we will have a "Windows 95" moment (when computing really became personal for the masses) in AI, and things will significantly change shape again.

josephg 13 days ago

What local model do you recommend these days? I’ve got a 4090, mostly sitting idle.

kennywinker 12 days ago

The answer to "which AI model", in mid 2026, is always qwen. Depending on your RAM, it's qwen3.5-9b, qwen3.6-35b-a3 in a 3- or 4-bit quant, or qwen3.6-27b. I'm told a bigger model quantized is better than a smaller model unquantized. In 16 GB of VRAM on 10-year-old hardware I can run a 3-bit quant of qwen3.6-35b-a3 at ~30 tokens/sec, and it can do a lot.
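
For a rough sense of why a 3-bit quant of a 35b model can fit where a smaller unquantized model might not, here is a back-of-the-envelope sketch. It assumes weight memory is simply parameter count times bits per weight, and it ignores the KV cache, activations, and the per-block overhead that real quant formats add, so actual files will come out somewhat larger. The function names and the candidate list are made up for illustration, not taken from any tool.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal GB.

    Assumption: size = parameters * bits / 8. Real quantized files also
    store scales and metadata, so treat this as a lower bound.
    """
    return params_billions * bits_per_weight / 8


def fits_in_vram(params_billions: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the weights alone fit; says nothing about context memory."""
    return weight_footprint_gb(params_billions, bits_per_weight) <= vram_gb


# Hypothetical candidates, loosely based on the sizes mentioned in the thread.
candidates = [
    ("9b at 16-bit (unquantized)", 9, 16),
    ("27b at 4-bit", 27, 4),
    ("35b at 3-bit", 35, 3),
    ("35b at 4-bit", 35, 4),
]

VRAM_GB = 16  # the card size discussed above

for label, params, bits in candidates:
    size = weight_footprint_gb(params, bits)
    verdict = "fits" if fits_in_vram(params, bits, VRAM_GB) else "does not fit"
    print(f"{label}: ~{size:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this estimate the 35b model at 3 bits needs about 13 GB for weights, which leaves only a few GB for context on a 16 GB card, while the same model at 4 bits would already be over budget.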

toyg 13 days ago

qwen 3.5 at 9b has been a pretty decent workhorse for me, even with context around 4k.
