Hacker News new | ask | show | jobs
by twtw 2775 days ago
I would be extremely surprised if the motivation for putting bfloat16 in tensorflow was not the TPU. That first public commit was ~1.5 years before TPUv2 was announced at I/O, so it was almost certainly already in development.
1 comments

bfloat16 was first in DistBelief, so it actually predates TensorFlow and TPUs (I worked on both systems). IIRC the motivation was more about minimizing parameter exchange bandwidth for large-scale CPU clusters rather than minimizing memory bandwidth within accelerators, but the idea generalized.