Hacker News
by bippingchip 1178 days ago
But don't underestimate how much 'middleware' there is in CUDA/cuDNN, NCCL, etc. A lot of the value is not in the compiler but in hand-crafted, carefully optimised libraries.

As an example: there are many low-level ways to run a conv2D kernel on a SIMT machine, but cuDNN will pick the best option for you based on the card you have and the tensor sizes you are using.