by _yosefk 2782 days ago
They don't show a comparison to bfloat16 PEs/FMA. IEEE half precision has a larger mantissa than bfloat16 (10 stored bits vs. 7), and the cost of a multiplier grows roughly with the square of the mantissa width. I'd expect much lower gains relative to bfloat16.
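
To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes multiplier cost scales exactly with the square of the significand width (counting the implicit leading bit: 11 bits for IEEE half, 8 for bfloat16) and ignores the exponent path, the adder, and normalization. The function name is made up for illustration.

```python
# Significand widths in bits, including the implicit leading 1.
SIGNIFICAND_BITS = {"fp16": 11, "bfloat16": 8}


def relative_multiplier_cost(fmt, baseline="fp16"):
    """Multiplier cost of `fmt` relative to `baseline`.

    Assumption: cost is proportional to (significand width)^2, which is
    only a first-order model of an array multiplier.
    """
    return (SIGNIFICAND_BITS[fmt] / SIGNIFICAND_BITS[baseline]) ** 2


ratio = relative_multiplier_cost("bfloat16")
print(f"bfloat16 multiplier is about {ratio:.2f}x the cost of an fp16 one")
```

Under that model a bfloat16 multiplier is already about half the cost of an fp16 one (64/121), so gains measured against fp16 would shrink noticeably against bfloat16.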