by
sottol
1189 days ago
They're using GPTQ. Here you go: https://arxiv.org/abs/2210.17323. The authors benchmarked two families of models across a wide range of parameter counts.
ddren
1189 days ago
llama.cpp is using RTN at the moment.
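For anyone unfamiliar with the term, RTN (round-to-nearest) just scales each weight and rounds it to the closest integer level, with no calibration data involved. Here is a rough sketch of the idea, not llama.cpp's actual code: the function names `rtn_quantize` and `rtn_dequantize`, the symmetric max-abs scale, and the 4-bit default are all assumptions made for illustration.

```python
def rtn_quantize(weights, bits=4):
    """Quantize one group of floats by round-to-nearest (illustrative sketch).

    Assumes a symmetric scheme with a single scale per group, chosen so the
    largest magnitude maps to the largest positive integer level.
    """
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each scaled weight to the nearest integer, then clamp to range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def rtn_dequantize(q, scale):
    """Map integer levels back to approximate float weights."""
    return [v * scale for v in q]


if __name__ == "__main__":
    q, scale = rtn_quantize([3.5, -1.0, 0.6, 0.0])
    print(q, scale)
    print(rtn_dequantize(q, scale))
```

Each weight is rounded independently of the others, which is the main contrast with GPTQ, where the rounding of later weights is adjusted to compensate for the error introduced by earlier ones.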