

	by boroboro4 534 days ago
	Because INT4-quantized weights still use FP16 compute in most cases: the weights are stored in 4 bits, but the arithmetic itself runs in FP16. Sometimes it's possible to use FP8/INT8 compute, and there is research on INT4 compute, but that's rather rare.
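
To make the "INT4 weights, FP16 compute" point concrete, here is a minimal NumPy sketch of weight-only quantization. It is an illustration under stated assumptions, not how any particular library does it: the function names (`quantize_int4`, `dequantize_fp16`, `linear_w4a16`), the symmetric per-group scheme, and the group size of 32 are all made up for this example, and the 4-bit codes are kept one per `int8` instead of being packed two per byte.

```python
import numpy as np

def quantize_int4(w, group_size=32):
    """Symmetric per-group quantization to 4-bit codes (hypothetical scheme).

    Returns codes in [-8, 7] stored one per int8, plus one FP16 scale per group.
    """
    rows, cols = w.shape
    g = w.reshape(rows, cols // group_size, group_size).astype(np.float32)
    max_abs = np.abs(g).max(axis=-1, keepdims=True)
    # Map the largest magnitude in each group to code 7; avoid dividing by zero.
    scale = np.where(max_abs > 0, max_abs / 7.0, 1.0)
    q = np.clip(np.rint(g / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize_fp16(q, scale):
    """Expand the 4-bit codes back to an FP16 weight matrix."""
    rows, groups, group_size = q.shape
    return (q.astype(np.float16) * scale).reshape(rows, groups * group_size)

def linear_w4a16(x, q, scale):
    """Linear layer with 4-bit weights and FP16 activations.

    The weights are dequantized first, so the matmul itself is FP16.
    """
    w = dequantize_fp16(q, scale)
    return x @ w.T

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 64)).astype(np.float32)   # full-precision weights
x = rng.standard_normal((4, 64)).astype(np.float16)    # FP16 activations

q, scale = quantize_int4(w)
y = linear_w4a16(x, q, scale)
print(q.dtype, y.dtype, y.shape)
```

The 4-bit format only shrinks what is stored and moved through memory; the multiply-accumulate still happens on FP16 values after dequantization. Doing the matmul in INT8 or INT4 would mean quantizing the activations `x` as well.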