| HN Mirror



	by kethinov 15 days ago
	qwen3-coder-next runs fine on my consumer-grade NVIDIA 4070. Performance is not spectacular, but it's only a little slower than a model that fits properly in VRAM.

1 comments

sonzohan 15 days ago

What are your settings and tokens/second? Even with 2 GPUs (MI100, RX 6600 XT 8GB) and 32GB of RAM, it was running at a snail's pace for me.

I didn't try sched_spread with a 3090 and the MI100, which would provide 56GB of combined VRAM.


kethinov 15 days ago

It's not speedy. I get 1-3 tokens per second.

The machine:

CPU: 24 × AMD Ryzen 9 9900X 12-Core Processor

RAM: 128GB

GPU: NVIDIA GeForce RTX 4060 Ti 16GB (I typo'd the GPU above)

(This is via Ollama on Ubuntu.)

But 1-3 tokens per second is much faster than a lot of other high-end models I've tried, so I was pretty pleased with it. Obviously, other models run much faster on this hardware.
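
For anyone wanting to compare numbers like the 1-3 tokens per second above, here is a minimal sketch of how generation speed could be computed from an Ollama API response. It assumes the response dict carries `eval_count` (tokens generated) and `eval_duration` (nanoseconds), as Ollama's generate endpoint reports them. The helper name `tokens_per_second` and the sample values are hypothetical, not measurements from this thread.

```python
def tokens_per_second(response: dict) -> float:
    """Return generation speed from an Ollama-style response dict.

    Assumes `eval_count` is the number of generated tokens and
    `eval_duration` is the generation time in nanoseconds.
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical sample: 120 tokens generated in 60 seconds.
sample = {"eval_count": 120, "eval_duration": 60_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Running `ollama run` with the `--verbose` flag should print a comparable eval rate without any code.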
