

by DougBTX 875 days ago

Nice graphs here: https://github.com/ggerganov/llama.cpp/pull/1684

So, for example, the 2-bit version of the 30B model is much worse than the original 30B, but still better than the 13B model.

Also, there are lots of extra details. For example, not all of the weights are stored at 2 bits, and even the "2-bit" weights cost more than 2 bits each on average, because each group of quantised weights shares a scale factor that is stored separately.
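
To make the overhead concrete, here is a rough back-of-the-envelope sketch. The group size, scale width, and tensor mix below are made-up illustrative numbers, not the actual layout used in the linked PR, and the function names are hypothetical.

```python
def effective_bits_per_weight(weight_bits, group_size, overhead_bits_per_group):
    """Average storage cost per weight, counting the shared per-group metadata."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return weight_bits + overhead_bits_per_group / group_size


def average_bits(mix):
    """Weighted average over (fraction_of_weights, bits_per_weight) pairs."""
    total = sum(fraction for fraction, _ in mix)
    return sum(fraction * bits for fraction, bits in mix) / total


# Assumed layout: 2-bit weights in groups of 32, one 16-bit scale per group.
low = effective_bits_per_weight(2, 32, 16)
print(low)  # 2.5

# Assumed mix: 90% of weights in the 2-bit format, 10% kept in a 4-bit format
# with the same group size and scale width.
high = effective_bits_per_weight(4, 32, 16)
print(average_bits([(0.9, low), (0.1, high)]))
```

With these assumed numbers the nominal "2-bit" model ends up well above 2 bits per weight, which is the point about shared scale factors and mixed precision.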