|
|
|
|
|
by Avisite
849 days ago
|
|
Does quantization need to be an all or nothing? with the kind of low bit models we have seen, my assumption would be that only certain weights would benefit from the extra precision. A mixture of precision with 2-bit, 3-bit, to 8-bit weights might perform well, but I am unsure if any training process could identify the weights that need the extra precision. |
|
So moving to extremely high efficiency native ternary hardware like with optics is going to be a much better result than trying to mix precision in classical hardware.
We'll see, but this is one of those things that I wouldn't have expected to be true but as soon as I see that it is it kind of makes sense. If it holds up (and it probably will) it's going to kick off a hardware revolution in AI.