by almostgotcaught
555 days ago
I don't understand this reasoning. Depending on the target (GPU/DSP/FPGA), going to the frequency domain for convolution made some amount of sense when FFT primitives were highly optimized relative to conventional conv or matmul implementations. But now we're like 10 years into the software/hardware arms race, and the conv/matmul kernels are just as highly optimized. In addition, the hardware has adapted too.

> using Tensor Cores for FFT

Why would I do this when I could just directly use tensor cores for matmul...? We have MMA, WMMA, WGMMA, etc., and they all target tensor cores explicitly.
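For anyone who wants the two routes side by side, here is a rough pure-Python sketch (not tied to any GPU library, and not a benchmark). It only shows that the direct sum and the frequency-domain path compute the same linear convolution. The function names `direct_conv`, `fft`, and `fft_conv` and the sample inputs are made up for illustration; a real kernel would look nothing like this.

```python
import cmath

def direct_conv(x, h):
    """Full linear convolution straight from the definition: O(len(x) * len(h))."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_conv(x, h):
    """Same result via the convolution theorem: zero-pad, transform, multiply, invert."""
    m = len(x) + len(h) - 1
    n = 1
    while n < m:  # pad to a power of two so circular wrap-around cannot alias
        n *= 2
    X = fft(list(x) + [0.0] * (n - len(x)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    y = fft([a * b for a, b in zip(X, H)], invert=True)
    # The inverse above is unnormalized, so divide by n here.
    return [v.real / n for v in y[:m]]

# Arbitrary illustrative inputs.
x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, -1.0, 0.25]
print(direct_conv(x, h))
print(fft_conv(x, h))
```

The two printed lists should agree up to floating-point rounding. Which route is faster on a given target is exactly the question above, and this sketch says nothing about that.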