by MacsHeadroom 1136 days ago
The latest release of bitsandbytes uses a new fp4 format. 4bit floating point scaling results in much lower perplexity than int4.

Also note that for a fixed memory (RAM) budget, 4bit (even int4) is always superior: a larger model at 4bit gets lower perplexity than a smaller model at 8bit.

E.g. LLaMA-13B int4 is far better (lower perplexity) than LLaMA-7B fp8 while using roughly the same amount of RAM.
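
For anyone who wants to check the RAM claim, here is a rough back-of-the-envelope sketch. It assumes nominal parameter counts of 13e9 and 7e9 (the real LLaMA counts differ slightly) and counts weights only, ignoring quantization scale overhead, activations, and the KV cache. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts (assumption, not exact model sizes).
llama_13b_4bit = weight_memory_gb(13e9, 4)
llama_7b_8bit = weight_memory_gb(7e9, 8)

print(f"13B @ 4bit: {llama_13b_4bit:.1f} GB")
print(f"7B  @ 8bit: {llama_7b_8bit:.1f} GB")
```

Under those assumptions the 13B model at 4bit needs about 6.5 GB for weights and the 7B model at 8bit about 7.0 GB, so the two land in the same memory class.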