| HN Mirror

	by Saurabh_29 2479 days ago
	The main bottleneck is the data transfer speed between GPU memory and the SMs. Also, using Tensor Cores doesn't necessarily imply using half precision, as NVIDIA now supports single-precision operations on Tensor Cores too.
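
	To make the bandwidth point concrete, here is a rough roofline-style sketch. The function names and the peak/bandwidth figures are made up for illustration and are not the specs of any real GPU; the model also assumes each matrix crosses the memory bus exactly once, which real kernels only approximate.

```python
def gemm_arithmetic_intensity(m, n, k, bytes_per_element):
    """FLOPs per byte for C = A @ B with A (m x k) and B (k x n).

    Assumption: A and B are each read once and C is written once.
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline bound: limited by compute or by memory traffic, whichever is lower."""
    return min(peak_flops, intensity * mem_bandwidth)


def is_memory_bound(intensity, peak_flops, mem_bandwidth):
    """True when memory traffic, not the math units, sets the ceiling."""
    return intensity * mem_bandwidth < peak_flops


# Hypothetical device numbers, chosen only to keep the arithmetic easy.
PEAK_FLOPS = 100e12      # FLOP/s the math units could sustain
MEM_BANDWIDTH = 1e12     # bytes/s between GPU memory and the SMs

# Large square matmul in half precision (2 bytes per element).
big_gemm = gemm_arithmetic_intensity(1024, 1024, 1024, 2)

# Matrix-vector product: same weight matrix, but almost no data reuse.
gemv = gemm_arithmetic_intensity(1024, 1, 1024, 2)

big_gemm_bound = is_memory_bound(big_gemm, PEAK_FLOPS, MEM_BANDWIDTH)
gemv_bound = is_memory_bound(gemv, PEAK_FLOPS, MEM_BANDWIDTH)
```

	Under these assumed numbers the matrix-vector case performs about one FLOP per byte moved, so faster math units would not help it; only the large matmul has enough data reuse to be limited by compute.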