


	by claiir 426 days ago
	Yea they mention a “perplexity drop” relative to naive quantization, but that’s meaningless to me.

	> We reduce the perplexity drop by 54% (using llama.cpp perplexity evaluation) when quantizing down to Q4_0.

	Wish they showed benchmarks / added quantized versions to the arena! :>
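To make the complaint concrete, here is a rough sketch of why a relative figure like "54%" says little on its own. It assumes "perplexity drop" means the increase in perplexity over the unquantized model, which the quoted sentence does not spell out. The function name and all perplexity values are hypothetical and not taken from the announcement.

```python
def remaining_gap(baseline_ppl: float, naive_q_ppl: float, reduction: float = 0.54) -> float:
    """Perplexity gap left over the unquantized baseline after the
    naive-quantization gap is shrunk by `reduction` (0.54 = 54%)."""
    naive_gap = naive_q_ppl - baseline_ppl
    return naive_gap * (1.0 - reduction)


# Hypothetical (baseline, naive Q4_0) perplexities, chosen only for illustration.
scenarios = {
    "small naive gap": (6.00, 6.10),
    "large naive gap": (6.00, 8.00),
}

for name, (base, naive) in scenarios.items():
    gap = remaining_gap(base, naive)
    print(f"{name}: naive +{naive - base:.2f} ppl, improved +{gap:.2f} ppl over baseline")
```

Both scenarios are a "54% reduction", yet the leftover gap differs by a factor of about twenty, so the absolute perplexities (or downstream benchmarks) are what would actually settle the question.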