| HN Mirror

	by halJordan 69 days ago
	Quantization is an extraordinarily trivial process, especially if you're doing it with llama.cpp (which Unsloth obviously does). Qwen did release an FP8 version, which is itself a quantized version.
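
To illustrate how little is involved in the basic idea, here is a minimal sketch of symmetric "absmax" 8-bit quantization in plain Python. It is an assumption-laden toy, not what llama.cpp or Qwen actually ship: llama.cpp's formats are block-wise and more elaborate, and FP8 is a floating-point format rather than the integer grid used here. The function names `quantize_absmax_int8` and `dequantize` are made up for this example.

```python
def quantize_absmax_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) input, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Toy values chosen for illustration only.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_absmax_int8(weights)
restored = dequantize(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.4f} -> {c:+4d} -> {r:+.4f}")
```

Each restored value is off by at most half a quantization step (`scale / 2`), which is why small weights such as 0.003 collapse to zero. Real schemes limit that loss by using a separate scale per small block of weights.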