Hacker News
by naasking 85 days ago
Interesting idea, but I hope people just start switching to ParoQuant, which eliminates basically all quantization error relative to fp16/bf16 even when going down to 4 bits:

https://github.com/z-lab/paroquant