by juancn 30 days ago
I'm running unsloth/Qwen3.6-35B-A3B-UD-Q8_K_XL with llama-server on an M3 Max (64GB) at ~57 t/s.

Prefill speed and 27B number?
Prefill is ~600 t/s.

I don't remember what the 27B number was. I tried a 27B with a different quantization at some point, but I settled on the 31B.
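
For a rough sense of what those two rates mean in practice, here is a back-of-the-envelope sketch. It assumes total response time is just prompt tokens divided by prefill speed plus output tokens divided by generation speed, and it ignores model load time, caching, and any slowdown at long context. The function name and the example token counts are made up for illustration; only the ~600 t/s and ~57 t/s figures come from the thread.

```python
def estimate_latency_s(prompt_tokens, output_tokens,
                       prefill_tps=600.0, decode_tps=57.0):
    """Rough wall-clock estimate in seconds for one request.

    prefill_tps and decode_tps default to the numbers quoted above;
    everything else here is an illustrative assumption.
    """
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("token rates must be positive")
    prefill_s = prompt_tokens / prefill_tps   # time to ingest the prompt
    decode_s = output_tokens / decode_tps     # time to generate the reply
    return prefill_s + decode_s


# Hypothetical request: 6000-token prompt, 570-token answer.
print(round(estimate_latency_s(6000, 570), 1))  # 10 s prefill + 10 s decode = 20.0
```

With these rates, a long prompt and a medium-length answer can cost about the same wall-clock time, which is why the prefill number matters for coding-agent style use.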