| HN Mirror



	by moffkalast 1102 days ago
	Very cool. By the way, the readme doesn't mention it, so I assume it's only for running full-precision models. Or do quantized formats (GGML, GPTQ, etc.) also work with it?


aarnphm 1102 days ago

Hi there, 8-bit and 4-bit quantization are currently supported on main. GPTQ support is a work in progress, as is GGML.


nacs 1101 days ago

GPTQ support would be amazing. AutoGPTQ is an easy way to integrate it: it's basically just importing autogptq and switching out one line in the model loading code.
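For illustration, here is a rough sketch of what that one-line swap might look like. It is not the project's actual loading code: `load_model`, the `quantize` option names, and the `SUPPORTED` tuple are hypothetical, and it assumes the `transformers`, `bitsandbytes`, and `auto_gptq` packages are installed (the AutoGPTQ import name is `auto_gptq`). The imports are deferred so each backend is only needed when it is selected.

```python
# Hypothetical option names, chosen for this sketch only.
SUPPORTED = ("int8", "int4", "gptq")


def load_model(model_id, quantize=None, device="cuda:0"):
    """Load a causal LM, optionally quantized (illustrative sketch)."""
    if quantize is not None and quantize not in SUPPORTED:
        raise ValueError(f"unknown quantize option: {quantize!r}")

    if quantize == "gptq":
        # The "switched-out line": load a pre-quantized GPTQ checkpoint
        # through AutoGPTQ instead of the regular transformers loader.
        from auto_gptq import AutoGPTQForCausalLM

        return AutoGPTQForCausalLM.from_quantized(model_id, device=device)

    from transformers import AutoModelForCausalLM

    if quantize is None:
        # Plain loading, no quantization.
        return AutoModelForCausalLM.from_pretrained(model_id)

    # 8-bit / 4-bit loading via bitsandbytes.
    from transformers import BitsAndBytesConfig

    config = BitsAndBytesConfig(
        load_in_8bit=(quantize == "int8"),
        load_in_4bit=(quantize == "int4"),
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=config, device_map="auto"
    )
```

Under these assumptions, a call such as `load_model(checkpoint, quantize="gptq")` would need `checkpoint` to point at a model that has already been quantized with GPTQ, and the 8-bit and 4-bit paths would need a CUDA-capable GPU.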

link