by _yosefk
2782 days ago
They don't show a comparison to bfloat16 PEs/FMA. IEEE half precision uses a larger mantissa than bfloat16 (10 explicit bits versus 7), and the cost of multiplication is roughly proportional to the square of the mantissa size. I'd expect much lower gains relative to bfloat16.
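To put a rough number on that, here is a back-of-the-envelope sketch. It assumes multiplier cost scales purely with the square of the significand width (counting the implicit leading bit), which ignores exponent handling, normalization, and the adder in an FMA; the function name and the cost model are mine, not anything from the paper.

```python
# Significand widths in bits, including the implicit leading 1.
# fp16 (IEEE binary16): 10 stored bits + 1 implicit.
# bfloat16: 7 stored bits + 1 implicit.
SIGNIFICAND_BITS = {"fp16": 11, "bfloat16": 8}


def relative_multiplier_cost(fmt_a: str, fmt_b: str) -> float:
    """Cost of fmt_a's multiplier relative to fmt_b's.

    Assumption: cost is proportional to the square of the significand
    width. This is a crude model, not a measured area or power figure.
    """
    a = SIGNIFICAND_BITS[fmt_a]
    b = SIGNIFICAND_BITS[fmt_b]
    return (a * a) / (b * b)


ratio = relative_multiplier_cost("fp16", "bfloat16")
print(f"fp16 / bfloat16 multiplier cost: {ratio:.2f}x")
```

Under that model an fp16 multiplier costs a bit under 2x a bfloat16 one, so a baseline built on bfloat16 would already capture part of the saving being claimed against fp16.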