by elmarhaussmann 3034 days ago
Where "fp16" is specified, the V100 benchmarks use the code from https://github.com/tensorflow/benchmarks/tree/master/scripts... with the flag --use_fp16=true, which enables fp16 for some but not all tensors.
1 comments

It's my understanding that fp16 (available on the previous-generation P100) and mixed precision (the major innovation of the V100) are different things, and that the speedup from TensorCores is entirely missing from this benchmark. Unlike the general-purpose P100, the TPU is a heavily optimized chip built for deep learning, hence its performance increase. However, the V100 is also heavily optimized for deep learning (arguably NVIDIA's first non-GPU chip). I'm in no position to defend NVIDIA here haha, but it seems like the benchmark misses the point if this is indeed the case.
It was my understanding that the TensorFlow benchmarks do make use of TensorCores on the V100. We'll verify and update accordingly.
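
For anyone following the fp16 vs. mixed-precision distinction raised above, here is a rough numerical sketch of why the two are not the same thing. It only emulates rounding in plain Python (via the struct module's half- and single-precision formats); it says nothing about what the TensorFlow benchmark scripts or the hardware actually do. The function names (to_fp16, to_fp32, dot_fp16, dot_mixed) and the test data are made up for illustration. The assumption is that "mixed precision" means fp16 inputs with products accumulated in fp32, while "pure fp16" rounds the accumulator to fp16 after every step.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16(a, b):
    """Dot product with fp16 inputs, fp16 products, and an fp16 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + prod)  # accumulator is rounded to fp16 every step
    return acc


def dot_mixed(a, b):
    """Dot product with fp16 inputs but an fp32 accumulator (assumed meaning
    of "mixed precision" for this illustration)."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp32(to_fp16(x) * to_fp16(y))
        acc = to_fp32(acc + prod)  # accumulator keeps fp32 precision
    return acc


# Hypothetical data: 10,000 terms of 0.01 * 1.0, so the exact answer is ~100.
n = 10000
a = [0.01] * n
b = [1.0] * n

fp16_result = dot_fp16(a, b)
mixed_result = dot_mixed(a, b)

print("pure fp16 accumulation: ", fp16_result)
print("fp16 in, fp32 accumulate:", mixed_result)
```

The pure-fp16 sum stalls far short of 100 once the accumulator grows large enough that each 0.01 increment is smaller than half the spacing between adjacent fp16 values, while the fp32-accumulated version stays close to 100. This is only about numerics; whether a given benchmark run actually exercises TensorCores is a separate question.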