by anschl 7 days ago
Nice benchmark, thanks! Which quants did you choose for the self-hosted models?

8-bit on that one (unsloth 8_K_XL). The next post, though, compares all the common quantizations of Qwen 3.6.

I have another post coming in a day or so on Gemma 4 with the 4-bit QAT version, and the results are very surprising (in a good way: Gemma 4 is impressive for this task).