by Reubend
55 days ago
It's not just quantization. I verified that if you naïvely quantize to 1 bit from the original Qwen model (and set grouped scale factors based on what the original model's weights were like), it just spits out gibberish.

> One thought that suggests rearranging is not involved, a thought that does not require any knowledge at all: if it did involve rearranging, someone would certainly have added some order by scale factor tricks with linear interpolation by address offset to lose even less precision.

Can you elaborate?
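For anyone wondering what "naïvely quantize to 1 bit with grouped scale factors" could look like, here is a minimal sketch. It assumes each group's scale is the mean absolute value of the original weights in that group, which is one common choice and not necessarily what was used in the experiment above. The function names and `group_size` are hypothetical, and it works on a plain list of floats, not an actual Qwen checkpoint.

```python
def quantize_1bit(weights, group_size=4):
    """Keep only the sign of each weight, plus one scale per group.

    Assumption: the scale is the mean absolute value of the group's
    original weights. Other choices are possible.
    """
    signs, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scales.append(sum(abs(w) for w in group) / len(group))
        signs.extend(1 if w >= 0 else -1 for w in group)
    return signs, scales


def dequantize_1bit(signs, scales, group_size=4):
    """Rebuild approximate weights: sign times the scale of its group."""
    return [s * scales[i // group_size] for i, s in enumerate(signs)]


weights = [0.5, -1.5, 1.0, -1.0, 0.2, 0.2, -0.2, 0.6]
signs, scales = quantize_1bit(weights)
approx = dequantize_1bit(signs, scales)
```

Every weight in a group collapses to the same magnitude, so all within-group variation is thrown away. That gives a rough sense of why doing this post hoc on a trained model, with no further training, would lose so much.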