| HN Mirror



	by deepnotderp 1785 days ago
	V100 GPUs also have non-tensor-core fp16 operations, I think.

2 comments

woadwarrior01 1785 days ago

Yes. Non-tensor-core fp16 ops are the default. Tensor cores are essentially 4x4 fp16 MAC units, and they are only used when the matrix dimensions are multiples of 8[1].

[1]: https://docs.nvidia.com/deeplearning/performance/mixed-preci...
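
A rough sketch of the multiple-of-8 check described above, in plain Python. It only does the arithmetic on the GEMM dimensions (M, N, K); it does not query the GPU or any library, and whether a tensor core kernel is actually selected can depend on other factors. The function names are hypothetical, chosen just for illustration.

```python
def round_up(n: int, multiple: int = 8) -> int:
    """Round n up to the next multiple (8 is the fp16 figure quoted above)."""
    return ((n + multiple - 1) // multiple) * multiple


def meets_dim_requirement(m: int, n: int, k: int, multiple: int = 8) -> bool:
    """True if all three GEMM dimensions are multiples of `multiple`."""
    return all(d % multiple == 0 for d in (m, n, k))


def padded_dims(m: int, n: int, k: int, multiple: int = 8) -> tuple:
    """Dimensions you would pad to so that the requirement is satisfied."""
    return tuple(round_up(d, multiple) for d in (m, n, k))


if __name__ == "__main__":
    dims = (1024, 512, 33)  # K = 33 is not a multiple of 8
    print(meets_dim_requirement(*dims))
    print(padded_dims(*dims))
```

Padding a dimension like a vocabulary or hidden size up to the next multiple of 8 is one common way to satisfy the constraint, at the cost of a little wasted compute.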


ml_hardware 1785 days ago

That's true. In fact, seeing V100 FP16 < T4 FP16 makes me believe you're right: the V100 would be much faster if the tensor cores were being used.
