| HN Mirror



	by RandomBK 501 days ago
	The only 32B distill I'm aware of is `DeepSeek-R1-Distill-Qwen-32B`, which is the `Qwen2.5-32B` base model distilled (i.e., further trained) on outputs from the full R1 model.

1 comment

GP is likely running the 4-bit quantized version of the finetuned Qwen model.
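
For a rough sense of why a 4-bit quant is the likely candidate, here is a back-of-the-envelope sketch of weight memory at different precisions. It assumes "32B" means exactly 32 billion parameters and counts weights only, ignoring the KV cache, activations, and the per-block scale overhead that real quantization formats add. The function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: "32B" taken as exactly 32 billion parameters.
N_PARAMS = 32e9

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # unquantized 16-bit weights
q4_gib = weight_memory_gib(N_PARAMS, 4)     # idealized 4-bit quantization

print(f"16-bit: {fp16_gib:.1f} GiB, 4-bit: {q4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights come to roughly a quarter of the 16-bit size, which is what would let the model fit on a single consumer GPU or a machine with modest RAM. Actual quantized files will be somewhat larger.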