by londons_explore
1086 days ago
Their best-performing 4-bit number format uses 1 sign bit, 3 exponent bits, and no mantissa bits! I.e., all weights, activations, and gradients become powers of two, which means all multiplications become simple bit shifts. That really changes the mathematics and the silicon design.
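To make the shift idea concrete, here is a rough Python sketch. The bit layout (bit 3 = sign, bits 0-2 = exponent, no exponent bias) and the names `decode`, `shift_multiply`, and `multiply_codes` are my own assumptions for illustration, not the format's actual encoding.

```python
def decode(code: int) -> tuple[int, int]:
    """Split a 4-bit code into (sign, exponent).

    Assumed layout (hypothetical): bit 3 is the sign, bits 0-2 are the
    exponent, with no bias and no mantissa, so the value is sign * 2**exponent.
    """
    sign = -1 if code & 0b1000 else 1
    exponent = code & 0b0111
    return sign, exponent


def shift_multiply(x: int, code: int) -> int:
    """Multiply an integer x by a power-of-two weight using only a shift."""
    sign, exponent = decode(code)
    return sign * (x << exponent)


def multiply_codes(a: int, b: int) -> int:
    """Multiply two encoded values: signs combine, exponents add."""
    sign_a, exp_a = decode(a)
    sign_b, exp_b = decode(b)
    return sign_a * sign_b * (1 << (exp_a + exp_b))


if __name__ == "__main__":
    print(shift_multiply(5, 0b0011))       # 5 * (+2**3)
    print(multiply_codes(0b0011, 0b1010))  # (+2**3) * (-2**2)
```

Note that when both operands are in this format, the "multiply" is really just an exponent add plus a sign XOR; the shift shows up when one side is an ordinary integer or fixed-point value.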
|
You're usually feeding a ton of multiplies into an accumulator. You can handle one or two mantissa bits with the same bit shifting, except that each multiply then outputs two or three numbers to accumulate. And accumulators are very easy to scale.
Also, in the extreme, I've seen powers of 4 get used.
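A rough Python sketch of what I mean, under my own assumptions: the significand has an implicit leading 1, results are kept in fixed point scaled by `2**mantissa_bits`, and the names `partial_products` and `power_of_four_multiply` are made up for illustration.

```python
def partial_products(x: int, exponent: int, mantissa: int, mantissa_bits: int) -> list[int]:
    """Break x * weight into shifted copies of x, one per set significand bit.

    Assumes the weight is (1.mantissa) * 2**exponent with an implicit leading 1.
    The terms are scaled by 2**mantissa_bits so everything stays an integer.
    """
    significand = (1 << mantissa_bits) | mantissa
    terms = []
    for bit in range(mantissa_bits + 1):
        if significand & (1 << bit):
            terms.append(x << (exponent + bit))
    return terms


def power_of_four_multiply(x: int, exponent: int) -> int:
    """With powers of 4, each exponent step is a shift by two bits."""
    return x << (2 * exponent)


if __name__ == "__main__":
    # Weight 1.5 * 2**2 with one mantissa bit: two terms go to the accumulator.
    terms = partial_products(3, exponent=2, mantissa=0b1, mantissa_bits=1)
    print(terms, sum(terms))
    print(power_of_four_multiply(3, 2))
```

With one mantissa bit you get at most two terms per multiply, and with two mantissa bits at most three, so the multiplier stays a shifter and all the extra work lands on the accumulator.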