by v3ss0n 352 days ago
That is true too. But I found Qwen3 14B with an 8-bit quant fares better than 32B with a 4-bit quant. Both had the KV cache at 8-bit. (I had thinking enabled; I will try again with /no_think.)
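
For context, here is a rough back-of-envelope sketch of why these two configs are a natural pair to compare: they land at a similar weight footprint. It assumes nominal parameter counts of 14B and 32B and counts weights only, ignoring quantization overhead (scales, zero points), the KV cache, and activations. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# Nominal sizes (assumption); real checkpoints differ slightly.
print(f"14B @ 8-bit: ~{weight_memory_gb(14, 8):.0f} GB")
print(f"32B @ 4-bit: ~{weight_memory_gb(32, 4):.0f} GB")
```

So roughly 14 GB vs 16 GB of weights, close enough that the choice comes down to output quality rather than what fits in memory.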