by londons_explore 1086 days ago
Their best performing 4-bit number format uses 1 sign bit, 3 exponent bits, and no mantissa bits!

I.e., all weights, activations, and gradients become powers of two, which means all multiplications become simple bit shifts. That really changes the mathematics and the silicon design.
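To make the "multiplies become shifts" point concrete, here is a rough sketch. The bit layout (sign in bit 3, exponent in bits 0-2), the zero exponent bias, and the names `decode`, `multiply`, and `dot` are my own assumptions for illustration, not the paper's actual encoding. Multiplying two such values is just an XOR of the signs and an add of the exponents, and each product is then materialized as a single shift before accumulation.

```python
def decode(code: int) -> tuple[int, int]:
    """Split a 4-bit code into (sign, exponent).

    Assumed layout: bit 3 is the sign, bits 0-2 are the exponent, bias 0,
    so the magnitudes are 1, 2, 4, ..., 128.
    """
    return (code >> 3) & 1, code & 0b111


def multiply(a: int, b: int) -> tuple[int, int]:
    """Multiply two codes: XOR the signs, add the exponents."""
    sign_a, exp_a = decode(a)
    sign_b, exp_b = decode(b)
    return sign_a ^ sign_b, exp_a + exp_b


def dot(xs: list[int], ws: list[int]) -> int:
    """Dot product of two code vectors using only shifts and adds."""
    acc = 0
    for x, w in zip(xs, ws):
        sign, exponent = multiply(x, w)
        term = 1 << exponent  # the product is a single shifted one
        acc += -term if sign else term
    return acc
```

For example, `0b0011` is +8 and `0b1010` is -4 under this layout, so `multiply(0b0011, 0b1010)` gives sign 1 and exponent 5, i.e. -32.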

1 comments

Does it really make much of a difference?

You're usually feeding a ton of multiplies into an accumulator. You can handle one or two mantissa bits with the same bit shifting, except that each multiply then outputs two or three numbers to accumulate. And accumulators are very easy to scale.
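A quick sketch of what I mean, with assumptions labeled: the weight is taken to be (1 + mantissa / 2^M) * 2^exponent with an implicit leading one, and `shift_terms` is a hypothetical helper name. To stay in integers the result is scaled by 2^M. Each multiply turns into one shifted copy of the input for the leading one, plus one more shifted copy per set mantissa bit, so at most two terms for one mantissa bit and at most three for two.

```python
def shift_terms(x: int, exponent: int, mantissa: int, mantissa_bits: int) -> list[int]:
    """Return the shifted copies of x whose sum is x * weight * 2**mantissa_bits.

    Assumed weight: (1 + mantissa / 2**mantissa_bits) * 2**exponent.
    """
    # Implicit leading one of the mantissa.
    terms = [x << (exponent + mantissa_bits)]
    # One extra shifted copy for every set mantissa bit.
    for i in range(mantissa_bits):
        if (mantissa >> i) & 1:
            terms.append(x << (exponent + i))
    return terms


def accumulate(products: list[list[int]]) -> int:
    """Feed every partial term into one wide accumulator."""
    return sum(term for terms in products for term in terms)
```

With x = 3, exponent 2, and a single mantissa bit set, the weight is 1.5 * 4 = 6, and `shift_terms(3, 2, 1, 1)` yields the two terms 24 and 12, which sum to 3 * 6 * 2 = 36. The multiplier is still only shifters; the accumulator just sees more inputs.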

Also, at the extreme, I've seen powers of 4 get used.