by woadwarrior01
48 days ago
> at Q4_K_M, stock-style quantization is retaining ~99–99.8% of BF16 accuracy

That's a tall claim. By that measure, even NVIDIA's QAD, which AFAIK is currently SOTA for 4-bit quantization (albeit NVFP4 instead of INT4), would be worse than Q4_K_M RTN quantization. :D

https://arxiv.org/abs/2601.20088