by easde
2052 days ago
It's true that depthwise convolutions are bandwidth bound, but most networks use them in combination with a kernel size 1 "convolution". If those two operations are tiled and fused together, the result is often compute bound again. Most ML frameworks and libraries, including cuDNN, usually don't fuse them though.
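
A rough back-of-the-envelope sketch of why fusing helps, counting FLOPs per byte of memory traffic. Everything here is an assumption for illustration: stride 1 with "same" padding, fp32 tensors, every tensor touching DRAM exactly once per kernel, and a fused kernel that keeps the intermediate activation entirely on chip. The function name and the example layer shape are hypothetical, not taken from any library.

```python
def separable_conv_intensity(h, w, c_in, c_out, k=3, bytes_per_elem=4):
    """Estimate arithmetic intensity (FLOPs per byte) for a k x k depthwise
    convolution followed by a 1 x 1 (pointwise) convolution.

    Idealized model: stride 1, 'same' padding, each tensor read or written
    to DRAM once per kernel launch.
    """
    pixels = h * w

    # One multiply-add counted as 2 FLOPs.
    dw_flops = 2 * k * k * pixels * c_in
    pw_flops = 2 * pixels * c_in * c_out

    # Element counts for activations and weights.
    act_in = pixels * c_in    # input to the depthwise conv
    act_mid = pixels * c_in   # depthwise output, pointwise input
    act_out = pixels * c_out  # final output
    dw_weights = k * k * c_in
    pw_weights = c_in * c_out

    # Separate kernels: the intermediate is written out, then read back.
    dw_bytes = bytes_per_elem * (act_in + act_mid + dw_weights)
    pw_bytes = bytes_per_elem * (act_mid + act_out + pw_weights)

    # Fused kernel (assumed): the intermediate never leaves the chip.
    fused_bytes = bytes_per_elem * (act_in + act_out + dw_weights + pw_weights)

    total_flops = dw_flops + pw_flops
    return {
        "depthwise_only": dw_flops / dw_bytes,
        "unfused": total_flops / (dw_bytes + pw_bytes),
        "fused": total_flops / fused_bytes,
    }


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only to illustrate the trend.
    for name, value in separable_conv_intensity(56, 56, 128, 128).items():
        print(f"{name}: {value:.1f} FLOPs/byte")
```

Under this model the depthwise convolution alone does only a few FLOPs per byte, while the fused pair does considerably more. Whether that is enough to be compute bound depends on the hardware's ratio of peak FLOPs to memory bandwidth and on how well the tiling works in practice.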