| HN Mirror


	by bleke 2775 days ago
	Sorry for the off-topic question, but does Intel calculate bonuses based on HN karma (more officially, "impact")? I've seen this bf16 story multiple times, and it looks like the authors are dying for a Christmas bonus.

rbanffy 2775 days ago

To me it looks like a clever optimization. It has the same range as FP32 at half the size, with less precision, and it can be converted back and forth by truncating the low 16 bits or appending 16 zero bits.

Is anyone else using it?
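
The truncate-and-pad conversion described above can be sketched in a few lines of Python. This is only an illustration: the helper names `fp32_to_bf16_bits` and `bf16_bits_to_fp32` are made up for this example, and it assumes plain truncation (round toward zero), whereas real hardware and libraries may round to nearest instead.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Keep only the upper 16 bits of the FP32 encoding (assumed: plain truncation)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bf16_bits_to_fp32(b: int) -> float:
    """Widen bfloat16 bits back to FP32 by appending 16 zero bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


original = 3.14159
bits = fp32_to_bf16_bits(original)
restored = bf16_bits_to_fp32(bits)
print(hex(bits), restored)  # 0x4049 3.140625
```

The sign bit and the 8 exponent bits survive untouched, which is why the range matches FP32; only the mantissa shrinks from 23 stored bits to 7.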

staticfloat 2775 days ago

Google uses it on their TPUs [0]. If you're interested in how it would affect the numerical stability of an algorithm you want to use, there is a Julia package that makes prototyping linear algebra over this datatype pretty straightforward [1].

[0] https://cloud.google.com/tpu/docs/system-architecture

[1] https://github.com/JuliaComputing/BFloat16s.jl
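
For a rough feel of the precision loss without installing anything, here is a minimal Python sketch that emulates bfloat16 by masking FP32 bits. It is not the Julia package's API; `to_bf16` and `bf16_sum` are hypothetical helpers, and truncation is assumed as the rounding mode. It shows a running sum of ones stalling once the total needs more than 8 significant bits.

```python
import struct


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by zeroing the low 16 bits of the FP32 encoding (assumed: truncation)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))
    return y


def bf16_sum(values):
    """Accumulate with every intermediate result squeezed back into bfloat16."""
    total = 0.0
    for v in values:
        total = to_bf16(total + to_bf16(v))
    return total


values = [1.0] * 1000
print(bf16_sum(values))  # 256.0, since 257 is not representable with 8 significant bits
print(sum(values))       # 1000.0 in ordinary double precision
```

Keeping the accumulator in a wider type such as FP32 avoids this particular stall, which is one reason to prototype an algorithm in the reduced format before committing to it.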

scottlegrand2 2775 days ago

And Facebook is taking this even further. And while all these things are very cool, do not let ASIC designers claim they are barriers to entry for GPUs and CPUs. Whatever variants of this precision potpourri catch on are but a generation away from incarnation in general processors IMO...

https://code.fb.com/ai-research/floating-point-math/

marcyb5st 2775 days ago

Google's TPUs use them, and have for a year. I don't agree with the "new" or "Intel's" in the title.

masklinn 2775 days ago

And TPUs use them because TensorFlow uses them; it's been present since the first public commit: https://github.com/tensorflow/tensorflow/blob/f41959ccb2d9d4...

twtw 2775 days ago

I would be extremely surprised if the motivation for putting bfloat16 in TensorFlow was not the TPU. That first public commit was ~1.5 years before TPUv2 was announced at I/O, so it was almost certainly already in development.

vrv 2775 days ago

bfloat16 was first in DistBelief, so it actually predates TensorFlow and TPUs (I worked on both systems). IIRC the motivation was more about minimizing parameter exchange bandwidth for large-scale CPU clusters rather than minimizing memory bandwidth within accelerators, but the idea generalized.

marcyb5st 2775 days ago

Thank you! I didn't know this. I thought they introduced them shortly after announcing TPU v1 at the 2016 (or 2017, can't remember) Google I/O.

shaklee3 2775 days ago

Why is it clever to change the mantissa and exponent size? I thought the clever one was Nervana's Flexpoint, which seemed at least partially novel. And it's interesting that Intel isn't pushing that format, given that Nervana's ASIC had it.
