| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ndriscoll 1164 days ago
	Because convolution is an O(N^2) operation, but a Fourier transform (and its inverse) can be done in O(NlogN) and turns convolution into O(N) multiplication. So if you do FFT, multiply, Inverse FFT, you get convolution in O(NlogN). I would guess that you don't even need to do the inverse FFT and can just learn in frequency space instead, but maybe there's some reason why that doesn't work out.

3 comments

jondea 1164 days ago

Computing convolutions using FFTs is efficient for large kernels (or filters). Most convolutions in popular ML models have small kernels, a regime where it is typically more efficient to reformulate the convolution as a matrix multiplication.

I think your complexity argument is correct for N=pixels=kernel size. But typically, pixels>>kernel size.

Disclosure: I work at Arm optimising open source ML frameworks. Opinions are my own.

link

staunton 1164 days ago

I see, the wording confused me. You don't really "put Fourier transforms around the convolutional layers" because you have to completely replace the convolutions.

This seems to be done in some cases. I guess it isn't done more widely because the "standard" convolution kernels are very small and the performance would actually be worse?

link

Ambix 1164 days ago

Could you point to some actual examples of using FFT for NNs in real projects / GitHub codebases? Would like to dig more into the thing.

link