| HN Mirror

by kanemcgrath 103 days ago

I have been using Qwen3.5-35B-A3B a lot in local testing, and it is by far the most capable model that fits on my machine. I think quantization technology has really upped its game around these models, and there were two quants that blew me away:

Mudler APEX-I-Quality, and then later Byteshape Q3_K_S-3.40bpw.

Both made claims that seemed too good to be true, but I couldn't find any traces of lobotomization while doing long agent coding loops. With the Byteshape quant I am up to 40+ t/s, which is a speed that makes agents much more pleasant. On an RTX 3060 12GB and 32GB of system RAM, I went from slamming all my available memory to having about 14GB to spare.
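
For anyone wondering what a bits-per-weight figure like 3.40bpw means for memory, here is a rough back-of-envelope sketch. It assumes the "35B" in the model name is the total parameter count, and it only counts the weights; KV cache, activations, and runtime overhead are not included, so real usage will be higher. The function name `quant_size_gb` is made up for illustration.

```python
def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: total parameter count read off the "35B" in the model name.
N_PARAMS = 35e9

# Compare the 3.40 bpw quant against 8-bit and 16-bit weights.
for label, bpw in [("3.40 bpw", 3.40), ("8 bpw", 8.0), ("16 bpw", 16.0)]:
    print(f"{label}: ~{quant_size_gb(N_PARAMS, bpw):.1f} GB")
```

Under those assumptions the 3.40bpw weights come to roughly 15 GB, versus about 70 GB at 16 bits per weight, which is why a quant at this level can be split across a 12GB GPU and system RAM at all.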

5 comments

Hugsun 103 days ago

Unfortunately, llama.cpp quantization technology has been stagnant for two years. The main quantization developer left or was kicked out of llama.cpp due to an attribution dispute. He created his own fork, ik_llama.cpp, where he has made multiple new and better quants.

unsloth and byteshape are just using and highlighting features that have been available the whole time. I am very invested in figuring out a solution to this dispute, or some way to get the new quants upstreamed.

kanemcgrath 103 days ago

Now that I have tried it out on a few tasks, Qwen3.6 is a huge jump in capability. It can make improvements to a project that Qwen3.5 always struggled with.

burgertea 102 days ago

Could you share more about your config? I've also got a 3060 12GB and 64GB of RAM, but I've never got local models running well enough to be useful.

edg5000 103 days ago

What can and what can't it do compared to Codex and CC?

mettamage 103 days ago

How does it compare against Qwen3.5 27B?

kanemcgrath 102 days ago

I haven't run 27B that much because it only runs at about 2 tokens/sec on my computer.

jadbox 103 days ago

Which one is best?

kanemcgrath 103 days ago

I would say Byteshape is smaller and faster, and I can’t really notice a quality difference. But I haven’t used it as much, since I only started using it a few days ago.
