| HN Mirror


	by Ambix 471 days ago
	I did my own experiments and, surprisingly, Q4_K_M models often outperform Q6 and Q8 quantised models. For bigger models (in the 8B to 70B range), Q4_K_M is very good: I saw no degradation compared to the full FP16 models.