
by MacsHeadroom 1200 days ago

That's the paper I referenced. But newer research is already challenging it.

'Int-4 llama is not enough - Int-3 and beyond' [0] suggests that 3-bit is best for models larger than ~10B parameters when binning is combined with GPTQ.
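
For a rough sense of what dropping from 4-bit to 3-bit buys at that scale, here is a back-of-the-envelope sketch. It counts only the raw weight storage and ignores the per-group scales and zero points that real quantized formats also store, so actual files will be somewhat larger. The helper name `weight_bytes` is made up for illustration, and the 10B parameter count is just the threshold mentioned above.

```python
def weight_bytes(n_params: int, bits: int) -> float:
    """Raw storage for n_params weights at `bits` bits each (no metadata)."""
    return n_params * bits / 8


if __name__ == "__main__":
    n_params = 10_000_000_000  # ~10B parameters, the threshold cited above
    for bits in (16, 4, 3):
        gib = weight_bytes(n_params, bits) / 2**30
        print(f"{bits:>2}-bit: {gib:.2f} GiB")
```

The point is only the ratio: 3-bit weights take three quarters of the space of 4-bit weights, before any format overhead.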