| HN Mirror



	by sander1095 106 days ago
	I sense that I don't really understand enough of your comment to know why this is important. I hope you can explain some things to me:
	- Why is Qwen's default "quantization" setup "bad"?
	- Who is Unsloth?
	- Why is his format better? What gains does a better format give? What are the downsides of a bad format?
	- What is quantization?
	Granted, I can look this up myself, but I thought I'd ask for the full picture for other readers.

3 comments

danielhanchen 106 days ago

Oh hey - we're actually the 4th largest distributor of OSS AI models in GB downloads - see https://huggingface.co/unsloth

https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs might be helpful. You might have heard of the 1-bit dynamic DeepSeek quants (we did that): not all layers can be 1-bit, so the important ones are kept in 8-bit or 16-bit, and we show it still works well.
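
To make the mixed-precision idea concrete, here is a rough sketch of the size arithmetic when a few layers stay at 8-bit and the rest drop to 1-bit. The layer names, parameter counts, and the "important" flags are all made up for illustration; this is not Unsloth's actual selection method, only the bookkeeping behind the idea.

```python
# Hypothetical layers: (name, parameter count, marked as important?)
# All names, sizes, and flags are invented for this sketch.
layers = [
    ("embed", 50_000_000, True),
    ("attn.0", 20_000_000, True),
    ("mlp.0", 400_000_000, False),
    ("mlp.1", 400_000_000, False),
    ("lm_head", 50_000_000, True),
]

def assign_bits(important, low_bits=1, high_bits=8):
    """Keep important layers at high precision, everything else at low precision."""
    return high_bits if important else low_bits

def model_size_bytes(layers, low_bits=1, high_bits=8):
    """Total storage for the weights alone, ignoring scales and metadata."""
    total_bits = sum(n * assign_bits(imp, low_bits, high_bits) for _, n, imp in layers)
    return total_bits / 8

total_params = sum(n for _, n, _ in layers)
mixed_bytes = model_size_bytes(layers)      # mixed 1-bit / 8-bit
full16_bytes = total_params * 16 / 8        # everything in 16-bit
avg_bits = mixed_bytes * 8 / total_params   # effective bits per weight

print(f"16-bit size: {full16_bytes / 1e6:.0f} MB")
print(f"mixed size:  {mixed_bytes / 1e6:.0f} MB")
print(f"average bits per weight: {avg_bits:.2f}")
```

Because the large layers dominate the parameter count, keeping a handful of small layers at 8-bit raises the average only slightly above 1 bit per weight in this toy setup.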

dist-epoch 106 days ago

The default Qwen "quantization" is not "bad", it's "large".

Unsloth releases lower-quality versions of the model (Qwen in this case). Think about taking a 95% quality JPEG and converting it to a 40% quality JPEG.

Models are quantized to lower quality/size so they can run on cheaper/consumer GPUs.
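
A minimal sketch of that trade-off, assuming plain symmetric uniform quantization of a list of random weights (real GGUF quant formats are more elaborate; the function names here are made up for illustration). Fewer bits means less storage and a larger round-trip error, much like lowering JPEG quality.

```python
import random

def quantize(weights, bits):
    """Symmetric uniform quantization: map floats to signed integers of `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                  # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers and the scale."""
    return [v * scale for v in q]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for one layer

errors = {}
for bits in (8, 4, 2):
    q, scale = quantize(weights, bits)
    errors[bits] = mean_abs_error(weights, dequantize(q, scale))
    size_bytes = len(weights) * bits / 8        # integer payload only, scale not counted
    print(f"{bits}-bit: {size_bytes:.0f} bytes, mean abs error {errors[bits]:.4f}")
```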

danielhanchen 105 days ago

Love the JPEG analogy :)

est 106 days ago

Hey, you can do a bit of research yourself and tell us your results!