by corysama
488 days ago
I'm pretty far from an expert. But at its core, ML is a bunch of matrix multiplications glued together with non-linear functions. So quantization just makes the weight matrices less precise, which makes the results a bit less accurate. It's not like a hash, where 1 wrong bit makes the whole result meaningless. The folks who quantized DeepSeek say they used a piece of tech called "BitsAndBytes". https://unsloth.ai/blog/dynamic-4bit Googling around for "bitsandbytes ai quantization" turns up this article, which looks nice: https://generativeai.pub/practical-guide-of-llm-quantization...
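Here's a toy sketch of the difference, in plain Python. To be clear, this is not how BitsAndBytes works internally; `quantize_absmax` is a made-up helper that does simple round-to-nearest 4-bit quantization, and the 8x64 random "layer" is just an assumption for illustration. It runs one matrix multiply plus a ReLU with full-precision and quantized weights, then flips one bit of a byte string and hashes it for contrast.

```python
import hashlib
import random


def quantize_absmax(weights, bits=4):
    """Snap every weight to the nearest of a few evenly spaced levels (hypothetical helper)."""
    levels = 2 ** (bits - 1) - 1  # 7 levels on each side of zero for 4-bit
    scale = max(abs(w) for row in weights for w in row) / levels
    return [[round(w / scale) * scale for w in row] for row in weights]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(values):
    """The non-linear 'glue' between matrix multiplications."""
    return [max(0.0, v) for v in values]


random.seed(0)
weights = [[random.uniform(-1, 1) for _ in range(64)] for _ in range(8)]
inputs = [random.uniform(-1, 1) for _ in range(64)]

# Same layer, once with full-precision weights and once with 4-bit weights.
exact = relu(matvec(weights, inputs))
approx = relu(matvec(quantize_absmax(weights), inputs))
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print("largest output difference after quantizing:", max_error)

# A hash has no notion of "close": flip one bit and the digest is unrelated.
data = bytearray(b"model weights")
digest_before = hashlib.sha256(data).hexdigest()
data[0] ^= 1  # flip the lowest bit of the first byte
digest_after = hashlib.sha256(data).hexdigest()
differing = sum(a != b for a, b in zip(digest_before, digest_after))
print("hex digits that changed in the hash:", differing, "of", len(digest_before))
```

The point of the comparison: each quantized weight is off by at most half a quantization step, so the layer's output error is bounded and usually small, while the hash output has nothing in common with the original.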