by Reubend 55 days ago
It's not just quantization. I verified that if you naïvely quantize the original Qwen model's weights to 1 bit (setting the per-group scale factors from the original weights), the result just spits out gibberish.
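For anyone who wants to see what "naïve" means here, a minimal sketch, not the exact script I ran: keep only the sign of each weight and store one scale per group, taken as the group's mean absolute value. The function names, the group size of 128, and the random Gaussian stand-in for real weights are all assumptions for illustration.

```python
import random

def quantize_1bit(weights, group_size=128):
    """Keep the sign of each weight plus one scale (mean |w|) per group."""
    signs, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scales.append(sum(abs(w) for w in group) / len(group))
        signs.extend(1 if w >= 0 else -1 for w in group)
    return signs, scales

def dequantize_1bit(signs, scales, group_size=128):
    """Rebuild approximate weights as sign * scale of the owning group."""
    return [s * scales[i // group_size] for i, s in enumerate(signs)]

def relative_error(original, restored):
    """L2 norm of the reconstruction error divided by L2 norm of the original."""
    num = sum((a - b) ** 2 for a, b in zip(original, restored))
    den = sum(a * a for a in original)
    return (num / den) ** 0.5

# Hypothetical stand-in for one weight tensor: zero-mean Gaussian values.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

signs, scales = quantize_1bit(weights)
restored = dequantize_1bit(signs, scales)
err = relative_error(weights, restored)
print(f"relative reconstruction error: {err:.3f}")
```

For roughly Gaussian weights the relative error comes out well above one half per tensor, which gives some intuition for why doing this to every layer without any further training produces garbage.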

> One thought that suggests rearranging is not involved, a thought that does not require any knowledge at all: if it did involve rearranging, someone would certainly have added some order-by-scale-factor tricks with linear interpolation by address offset to lose even less precision.

Can you elaborate?