by MacsHeadroom
1136 days ago
The latest release of bitsandbytes uses a new fp4 format. 4-bit floating-point scaling results in much lower perplexity than int4. Also note that for a fixed memory (RAM) budget, 4-bit (even int4) is always superior, giving lower perplexity than 8-bit. E.g. LLaMA-13B int4 has far better (lower) perplexity than LLaMA-7B fp8 while using about the same amount of RAM.
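To make the "same amount of RAM" comparison concrete, here is a rough back-of-the-envelope sketch. It assumes nominal parameter counts of 13e9 and 7e9 (the real models differ slightly) and counts only the weights, ignoring quantization constants, activations, and the KV cache. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no overhead for scales, zero points, or unquantized layers.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts (assumed round numbers, not exact model sizes).
llama_13b_4bit = weight_memory_gb(13e9, 4)
llama_7b_8bit = weight_memory_gb(7e9, 8)

print(f"LLaMA-13B @ 4-bit: ~{llama_13b_4bit:.1f} GB")
print(f"LLaMA-7B  @ 8-bit: ~{llama_7b_8bit:.1f} GB")
```

Under these assumptions the 13B model at 4 bits needs slightly less weight memory than the 7B model at 8 bits, which is why the two are treated as the same memory budget.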