by brucethemoose2
1089 days ago
There is some overhead from the quantization, and right now the operations themselves are sometimes done at higher precision than the weights in RAM. Widespread 4-bit hardware support will also take some time: if the HW makers started designing 4-bit silicon in 2022, then we are still years away.
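To make the "higher precision than the weights in RAM" point concrete, here is a rough pure-Python sketch. It assumes a simple symmetric scheme with one float scale per weight vector; real libraries use per-block scales and fused kernels, and all function names here are made up for illustration. The weights sit in memory as packed 4-bit nibbles, but the dot product unpacks them and multiplies in floating point.

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: integers in [-8, 7] plus one float scale."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def pack_nibbles(q):
    """Store two 4-bit values per byte; this is where the RAM saving comes from."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count
    return bytes((a & 0xF) | ((b & 0xF) << 4) for a, b in zip(q[0::2], q[1::2]))


def unpack_nibbles(packed, n):
    """Recover signed 4-bit integers from the packed bytes (extra work per use)."""
    out = []
    for byte in packed:
        for nib in (byte & 0xF, byte >> 4):
            out.append(nib - 16 if nib >= 8 else nib)
    return out[:n]


def dot_dequantized(packed, scale, n, activations):
    """Dot product done in float: weights are dequantized before the multiply."""
    q = unpack_nibbles(packed, n)
    return sum((qi * scale) * a for qi, a in zip(q, activations))
```

The overhead shows up twice in this sketch: the scale has to be stored alongside the nibbles, and every use pays for the unpack and rescale step before the float arithmetic. Native 4-bit hardware would be needed to skip that last part.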