Hacker News
by walterbell 409 days ago
What's the minimum GPU/NPU hardware and memory to run Qwen3 locally?
5 comments

There is a 0.6B model so basically nothing.

And the MoE 30B one has a decent shot at running OK without a GPU. I'm on a 5800X3D, so two generations old, and it's still very usable.
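
A rough way to see why the MoE model is usable on CPU: token generation is mostly limited by how many weight bytes have to be read from memory per token, and a 30B-A3B model only activates about 3B parameters per token. The sketch below is a back-of-the-envelope upper bound, not a benchmark. The 4-bit quantization and the 50 GB/s figure (roughly dual-channel DDR4) are assumptions, and the function name is made up for illustration.

```python
def decode_tokens_per_sec_upper_bound(active_params_billion, bits_per_weight, mem_bandwidth_gb_s):
    """Upper bound on tokens/s if every active weight is read once per token."""
    # GB of weights touched per generated token
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return mem_bandwidth_gb_s / gb_read_per_token

# Assumed values: 4-bit quant, ~50 GB/s memory bandwidth (dual-channel DDR4 ballpark)
moe_bound = decode_tokens_per_sec_upper_bound(3, 4, 50)     # ~3B active params per token
dense_bound = decode_tokens_per_sec_upper_bound(30, 4, 50)  # hypothetical dense 30B for comparison

print(f"MoE 30B-A3B bound: {moe_bound:.1f} tok/s")
print(f"Dense 30B bound:   {dense_bound:.1f} tok/s")
```

Real throughput will be lower (compute, cache misses, prompt processing), but the ratio between the two bounds is the point: the MoE reads about a tenth of the bytes per token.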

I'm running 4B on my 8GB AMD 7600 via ollama.

`model.safetensors` for Qwen3-0.6B is a single 1.5GB file.

Qwen3-235B-A22B has 118 `.safetensors` files at 4GB each.

There are a bunch of models and quants between those.
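
Those file counts line up with a simple parameters-times-bytes estimate. A minimal sketch, assuming the published checkpoints store 2 bytes per parameter (BF16) and that a 4-bit quant is about 0.5 bytes per parameter; `checkpoint_gb` is a hypothetical helper, and real quant formats add some overhead on top:

```python
def checkpoint_gb(params_billion, bytes_per_param=2.0):
    """Approximate weight size in GB: billions of params x bytes per param."""
    return params_billion * bytes_per_param

# From the numbers above: 118 shards at roughly 4 GB each
total_from_files_gb = 118 * 4

# Estimate for 235B parameters at 2 bytes each (assumed BF16)
estimate_bf16_gb = checkpoint_gb(235)

# Same model at an assumed ~0.5 bytes per parameter (4-bit quant, ignoring overhead)
estimate_4bit_gb = checkpoint_gb(235, bytes_per_param=0.5)

print(total_from_files_gb, estimate_bf16_gb, estimate_4bit_gb)
```

The shard total and the 2-bytes-per-parameter estimate land within a couple of GB of each other, which is why the quants in between matter so much for local use.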

Does it run in 8x80G? Or does the KV cache and other buffers push it over the edge?
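
One way to reason about this: take the weight size from the shard count, subtract it from total GPU memory, and see how many tokens of KV cache fit in what is left. This is only a sketch. The layer count, KV head count, and head dimension below are assumed placeholder values that should be checked against the model's `config.json`, and it ignores activations, framework overhead, and fragmentation.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache per token: one K and one V vector per layer per KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem

total_gpu_gb = 8 * 80        # 8 GPUs at 80 GB each
weights_gb = 118 * 4         # unquantized shards, from the file count above
headroom_gb = total_gpu_gb - weights_gb

# ASSUMED architecture values, not taken from this thread; verify in config.json
per_token = kv_cache_bytes_per_token(num_layers=94, num_kv_heads=4, head_dim=128)

# Total tokens of KV cache (summed over all concurrent requests) that fit in the headroom
max_cached_tokens = headroom_gb * 1e9 // per_token

print(f"headroom: {headroom_gb} GB, KV cache per token: {per_token} bytes")
print(f"rough KV cache capacity: {int(max_cached_tokens):,} tokens")
```

Under those assumptions the unquantized weights fit with room to spare, so whether it goes over the edge depends mostly on batch size, context length, and how much the serving stack reserves for its own buffers.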

Qwen3 is a family of models. The very smallest are only a few GB and will run comfortably on virtually any computer from the last 10 years or a recent-ish smartphone. The largest: well, that depends on how fast you want it to run.

There are models down to 0.6B, and you can even run Qwen3 30B-A3B reasonably fast on CPU only.