	by easde 2052 days ago
	It's true that depthwise convolutions are bandwidth bound, but most networks use them in combination with a kernel size 1 (pointwise) "convolution". If those two operations are tiled and fused together, so the depthwise output never has to round-trip through main memory, the result is often compute bound again. Most ML frameworks and libraries, including cuDNN, usually don't do this fusion though.
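
	To put rough numbers on that, here is a back-of-the-envelope sketch of arithmetic intensity (FLOPs per byte of main-memory traffic) for a depthwise layer on its own versus a fused depthwise + 1x1 pair. Everything in it is an assumption for illustration: the function names are made up, and the model assumes fp32, stride 1, "same" padding, each tensor touched exactly once, and a fused kernel that keeps the intermediate entirely on-chip. It is not how any particular library counts or schedules things.

```python
# Rough arithmetic-intensity model. All names and the traffic model are
# illustrative assumptions: fp32, stride 1, "same" padding, every tensor
# read or written exactly once, no cache effects.

def depthwise_cost(h, w, c, k, bytes_per_elem=4):
    """(FLOPs, bytes moved) for a k x k depthwise convolution on its own."""
    flops = 2 * h * w * c * k * k  # one multiply-add per tap per output element
    traffic = bytes_per_elem * (h * w * c      # read input
                                + h * w * c    # write output
                                + k * k * c)   # read weights
    return flops, traffic


def pointwise_cost(h, w, c_in, c_out, bytes_per_elem=4):
    """(FLOPs, bytes moved) for a 1x1 convolution on its own."""
    flops = 2 * h * w * c_in * c_out
    traffic = bytes_per_elem * (h * w * c_in       # read input
                                + h * w * c_out    # write output
                                + c_in * c_out)    # read weights
    return flops, traffic


def fused_cost(h, w, c_in, c_out, k, bytes_per_elem=4):
    """(FLOPs, bytes moved) when depthwise + 1x1 run as one kernel.

    Assumes the depthwise output stays on-chip, so it is never written
    to or re-read from main memory.
    """
    dw_flops, _ = depthwise_cost(h, w, c_in, k, bytes_per_elem)
    pw_flops, _ = pointwise_cost(h, w, c_in, c_out, bytes_per_elem)
    traffic = bytes_per_elem * (h * w * c_in       # read input once
                                + h * w * c_out    # write final output once
                                + k * k * c_in     # depthwise weights
                                + c_in * c_out)    # pointwise weights
    return dw_flops + pw_flops, traffic


def intensity(cost):
    """FLOPs per byte of main-memory traffic."""
    flops, traffic = cost
    return flops / traffic


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only as an example.
    shape = dict(h=56, w=56, c_in=128, c_out=128, k=3)
    dw = depthwise_cost(shape["h"], shape["w"], shape["c_in"], shape["k"])
    fused = fused_cost(**shape)
    print(f"depthwise alone: {intensity(dw):.1f} FLOPs/byte")
    print(f"fused dw + 1x1:  {intensity(fused):.1f} FLOPs/byte")
```

	Under this model a 3x3 depthwise layer alone can never exceed 2.25 FLOPs/byte no matter how large the tensor is, while the fused pair scales with the number of output channels. Whether the fused number actually lands on the compute-bound side depends on the hardware's own FLOPs-to-bandwidth ratio, which this sketch deliberately leaves out.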