Hacker News new | ask | show | jobs
by staunton 1157 days ago
> a cursory search suggests it provides the expected speedup

What do you mean? Processing a CNN layer takes an amount of time that does not depend on the input data, only the input/output sizes. Fourier transform is just a change of basis. Why should anything speed up?

1 comments

Because convolution is an O(N^2) operation, but a Fourier transform (and its inverse) can be done in O(NlogN) and turns convolution into O(N) multiplication. So if you do FFT, multiply, Inverse FFT, you get convolution in O(NlogN). I would guess that you don't even need to do the inverse FFT and can just learn in frequency space instead, but maybe there's some reason why that doesn't work out.
Computing convolutions using FFTs is efficient for large kernels (or filters). Most convolutions in popular ML models have small kernels, a regime where it is typically more efficient to reformulate the convolution as a matrix multiplication.

I think your complexity argument is correct for N=pixels=kernel size. But typically, pixels>>kernel size.

Disclosure: I work at Arm optimising open source ML frameworks. Opinions are my own.

I see, the wording confused me. You don't really "put Fourier transforms around the convolutional layers" because you have to completely replace the convolutions.

This seems to be done in some cases. I guess it isn't done more widely because the "standard" convolution kernels are very small and the performance would actually be worse?

Could you point to some actual examples of using FFT for NNs in real projects / GitHub codebases? Would like to dig more into the thing.