HN Mirror



	by sottol 1189 days ago
	They're using GPTQ -- here you go: https://arxiv.org/abs/2210.17323 . The authors benchmarked two model families (OPT and BLOOM) across a wide range of parameter counts.

1 comment

llama.cpp is using RTN (round-to-nearest) at the moment.
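
For anyone unfamiliar with the term, here is a rough sketch of what round-to-nearest quantization does. It is only an illustration under my own assumptions (a symmetric 4-bit grid and one scale per block of weights), not llama.cpp's actual code or block format; `rtn_quantize` and `rtn_dequantize` are hypothetical names.

```python
# Minimal round-to-nearest (RTN) quantization sketch.
# Assumptions (not taken from llama.cpp): symmetric 4-bit grid,
# one shared scale per block of weights.

def rtn_quantize(block, bits=4):
    """Map each float to the nearest point on a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4 bits
    amax = max(abs(x) for x in block)           # largest magnitude in the block
    scale = amax / qmax if amax > 0 else 1.0    # avoid dividing by zero
    # Python's round() sends exact ties to the nearest even integer.
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return q, scale

def rtn_dequantize(q, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [v * scale for v in q]

weights = [0.7, -0.3, 0.12, 0.0]
q, scale = rtn_quantize(weights)
restored = rtn_dequantize(q, scale)
print(q, scale, restored)
```

Each weight is rounded on its own, so the per-weight error is at most half a grid step. GPTQ, by contrast, also adjusts the not-yet-quantized weights to compensate for the error introduced so far, which is why it tends to hold up better at low bit widths.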