by sp332 1115 days ago
The question is whether this step is actually doing the GPTQ optimized quantization, or simple truncation.

The paper introduces a new quantization scheme, NF4 (4-bit NormalFloat), which builds on earlier work on quantile quantization. So it's not simple truncation, but it's also not a GPTQ-style optimization method. Figure 3 of the paper shows NF4 improving accuracy over FP4.
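
To make the distinction concrete, here is a rough sketch of quantile-style quantization in Python. It is not the paper's exact NF4 construction and not any library's API. The function names are made up for illustration, the 16 levels are simply evenly spaced quantiles of a standard normal rescaled to [-1, 1], and each block is scaled by its absolute maximum before the nearest-level lookup.

```python
from statistics import NormalDist


def quantile_codebook(bits=4):
    """Hypothetical helper: levels at evenly spaced quantiles of N(0, 1),
    rescaled so the outermost levels sit at -1 and +1."""
    n = 2 ** bits
    nd = NormalDist()
    # Midpoint probabilities (i + 0.5) / n avoid the infinite 0 and 1 quantiles.
    raw = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]
    top = max(abs(v) for v in raw)
    return [v / top for v in raw]


def quantize_block(weights, codebook):
    """Absmax-scale one block, then store the index of the nearest level."""
    scale = max(abs(w) for w in weights) or 1.0  # guard against an all-zero block
    indices = [
        min(range(len(codebook)), key=lambda k: abs(w / scale - codebook[k]))
        for w in weights
    ]
    return indices, scale


def dequantize_block(indices, scale, codebook):
    """Map stored indices back to approximate weights."""
    return [codebook[i] * scale for i in indices]


codebook = quantile_codebook()
indices, scale = quantize_block([0.5, -2.0, 0.1], codebook)
restored = dequantize_block(indices, scale, codebook)
```

Simple truncation would just drop bits of the existing representation. Here the levels are chosen from the assumed weight distribution, so they sit closer together near zero, where normally distributed weights concentrate. Unlike GPTQ, nothing in this sketch looks at calibration data or adjusts weights to compensate for rounding error.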