by gok 314 days ago
So this 4B dense model gets very similar performance to the 30B MoE variant with a 7.5x smaller footprint.
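For anyone checking the 7.5x figure, here is a rough sketch of the arithmetic. It counts weights only (no KV cache or activations), and the bytes-per-parameter values are my own assumptions for illustration (2 for 16-bit weights, 0.5 for a 4-bit quantization), not numbers from the model cards. The function name is made up for this example.

```python
# Back-of-the-envelope weight footprint; all byte sizes below are assumptions.
def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (billions of params x bytes per param)."""
    return params_billion * bytes_per_param

DENSE_PARAMS_B = 4.0   # the 4B dense model from the comment
MOE_PARAMS_B = 30.0    # total parameters of the 30B MoE variant

# The MoE model must keep all of its experts in memory, so the footprint
# ratio follows total parameter count, not the parameters active per token.
ratio = MOE_PARAMS_B / DENSE_PARAMS_B

for label, bytes_per_param in [("16-bit", 2.0), ("4-bit", 0.5)]:
    dense_gb = weight_footprint_gb(DENSE_PARAMS_B, bytes_per_param)
    moe_gb = weight_footprint_gb(MOE_PARAMS_B, bytes_per_param)
    print(f"{label}: dense ~{dense_gb:.1f} GB, MoE ~{moe_gb:.1f} GB, ratio {ratio:.1f}x")
```

The ratio stays the same at any precision as long as both models use the same one. Note that this is about memory only: going by the "A3B" in the MoE model's name, it activates roughly 3B parameters per token, so per-token compute is a separate comparison.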
1 comments

It gets similar performance to the old version of the 30B MoE model, but not the updated version. https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507
I still think it's very commendable, though.

I am running this beast on my dumb PC with no GPU. Now we're talking!