by areddyyt
643 days ago
We don't achieve peak compression efficiency because more complex weight-unpacking mechanisms kill throughput. To be more explicit, the weight matrix's values belong to the set {-1, 0, 1}. When using two bits to encode these weights, we leave one of the four possible states unused:

10 => 1
01 => 0
00 => -1
11 => ?

I think selecting the optimal radix economy will come into play more on custom silicon, where we can implement circuitry and instructions to rapidly decompress weights or work with the compressed weights directly.
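To make the trade-off concrete, here is a rough Python sketch (not the actual kernel code; all function names are made up for illustration). It compares the two-bit mapping above against one possible denser scheme, assumed here to be base-3 packing of five weights per byte, which works because 3^5 = 243 fits in 256. That costs 1.6 bits per weight instead of 2, close to the log2(3) ≈ 1.585 floor, but unpacking needs a divide and modulo per weight rather than a shift and a mask.

```python
# Mapping from the comment: 00 => -1, 01 => 0, 10 => 1 (11 is unused).
ENCODE_2BIT = {-1: 0b00, 0: 0b01, 1: 0b10}
DECODE_2BIT = {code: w for w, code in ENCODE_2BIT.items()}


def pack_2bit(weights):
    """Pack ternary weights four per byte, two bits each (hypothetical helper)."""
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= ENCODE_2BIT[w] << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_2bit(data, n):
    """Recover n weights using only a shift and a mask per weight."""
    return [DECODE_2BIT[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)]


def pack_base3(weights):
    """Pack five weights per byte as a base-3 number (assumed denser scheme)."""
    out = bytearray()
    for i in range(0, len(weights), 5):
        byte = 0
        # The first weight in the group becomes the least significant digit.
        for w in reversed(weights[i:i + 5]):
            byte = byte * 3 + (w + 1)
        out.append(byte)
    return bytes(out)


def unpack_base3(data, n):
    """Recover n weights; each one costs a divide and a modulo."""
    out = []
    for byte in data:
        for _ in range(5):
            byte, digit = divmod(byte, 3)
            out.append(digit - 1)
    return out[:n]  # drop padding digits from a short final group


weights = [1, 0, -1, -1, 1, 0, 0, 1, -1, 1] * 2  # 20 ternary weights
two_bit = pack_2bit(weights)
base3 = pack_base3(weights)
print(len(two_bit), len(base3))
```

For these 20 weights the two-bit form takes 5 bytes and the base-3 form takes 4. The saving is real but modest, which is why the cheaper unpacking wins on general-purpose hardware.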