by
sebzim4500
1103 days ago
Yes, but to my knowledge it doesn't do any of the complicated optimization that SOTA quantisation methods use. It's basically just a bunch of rounding.
There are advantages to simplicity, after all.
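To make "a bunch of rounding" concrete, here is a minimal sketch of block-wise round-to-nearest quantisation. It is an illustration only, not the project's actual format: the function names, the 4-bit width, and the single per-block scale are all assumptions.

```python
# Hypothetical sketch of round-to-nearest quantisation (not any real library's API).

def quantize_block(values, bits=4):
    """Quantise one block of floats to signed integers with one shared scale."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit signed values
    amax = max(abs(v) for v in values)         # largest magnitude in the block
    scale = amax / qmax if amax > 0 else 1.0   # avoid dividing by zero
    # Round each value to the nearest integer step and clamp to the valid range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, q


def dequantize_block(scale, q):
    """Recover approximate floats from the integers and the block scale."""
    return [scale * x for x in q]


weights = [0.12, -0.5, 0.33, 0.07, -0.21, 0.49, 0.0, -0.02]
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The per-weight error is bounded by about half of `scale`, with no calibration data or optimisation pass involved, which is the simplicity being pointed at.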
brucethemoose2
1103 days ago
It's not so simple anymore; see
https://github.com/ggerganov/llama.cpp/pull/1684