	by Taek 1201 days ago
	For completeness, there's also another paper demonstrating that you get more power/accuracy per bit at 4 bits than at any other level of precision (including 2 bits and 3 bits).

1 comment

MacsHeadroom 1200 days ago

That's the paper I referenced. But newer research is already challenging it.

'Int-4 llama is not enough - Int-3 and beyond' [0] suggests that 3-bit is best for models larger than ~10B parameters when combining binning and GPTQ.

[0] https://nolanoorg.substack.com/p/int-4-llama-is-not-enough-i...
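
To make the "per bit" framing concrete, here is a rough sketch of the storage arithmetic behind it: for a fixed memory budget, fewer bits per weight means room for more parameters. The helper names and the 13B parameter count are illustrative assumptions, not taken from either paper, and the numbers cover raw weights only (no scales, zero-points, or activations).

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> float:
    """Bytes needed to store n_params weights at the given precision (weights only)."""
    return n_params * bits_per_weight / 8


def params_for_budget(budget_bytes: int, bits_per_weight: int) -> int:
    """Largest parameter count whose raw weights fit in budget_bytes."""
    return (budget_bytes * 8) // bits_per_weight


GIB = 1024 ** 3
N_PARAMS = 13_000_000_000  # hypothetical model size, for illustration only

for bits in (16, 8, 4, 3, 2):
    size_gib = weight_bytes(N_PARAMS, bits) / GIB
    print(f"{bits:>2}-bit weights: {size_gib:6.2f} GiB")
```

This says nothing about which precision is actually best; it only shows why the comparison is made per bit. Going from 4-bit to 3-bit frees a quarter of the weight storage, so the question is whether a larger 3-bit model beats a smaller 4-bit one at the same footprint.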