Hacker News
by AndrewGYork 1878 days ago
One of the questions raised at the end of the article:

"Do we ever benefit from explicitly putting Fourier layers into our models?"

Has a simple, partial answer here:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.l...

"...multiplying a vector by the matrix returned by dft is mathematically equivalent to (but much less efficient than) the calculation performed by scipy.fft.fft."

FFT-as-a-matrix-multiply is O(n^2) versus O(n log n) for a standard FFT, so it is much slower to compute, especially on large inputs.
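
To make the equivalence concrete, here is a minimal sketch in NumPy. The helper `dft_matrix` is a hypothetical name of mine; it builds the same unscaled matrix that scipy.linalg.dft returns by default, and the input length of 256 is an arbitrary choice.

```python
import numpy as np

def dft_matrix(n):
    """Unscaled n x n DFT matrix, W[j, k] = exp(-2*pi*i*j*k/n)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)  # arbitrary test signal

via_matmul = dft_matrix(x.size) @ x  # dense matrix-vector product, O(n^2)
via_fft = np.fft.fft(x)              # fast algorithm, O(n log n)

# Both routes compute the same transform, up to floating-point error.
same = np.allclose(via_matmul, via_fft)
print(same)
```

The matrix route also needs O(n^2) memory just to hold the matrix, which is another reason it scales badly.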

1 comments

You can do the linear parts of neural nets in the frequency domain, but AFAIK you can't do the nonlinearity, so you have to inverse transform back to the spatial domain for that. The nonlinearity is an absolutely essential part of pretty much every neural net layer, so unfortunately there is no big win to be had. For convolutional nets in particular there are other ways of going faster than a naive matrix multiply, e.g. Winograd convolution.
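
A rough sketch of that point, assuming a 1-D circular convolution as the linear part and ReLU as the nonlinearity (both are my choices for illustration, as are all the variable names). It only shows that one naive frequency-domain ReLU fails to match; it doesn't prove that no frequency-domain equivalent exists.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = rng.standard_normal(n)  # input signal
w = rng.standard_normal(n)  # filter, same length so the convolution is circular

# Linear part, computed directly in the spatial domain...
direct = np.array(
    [sum(x[m] * w[(k - m) % n] for m in range(n)) for k in range(n)]
)
# ...and as a pointwise product of spectra (convolution theorem).
Y = np.fft.fft(x) * np.fft.fft(w)
via_freq = np.fft.ifft(Y).real
linear_matches = np.allclose(direct, via_freq)

def relu(v):
    return np.maximum(v, 0.0)

# ReLU is pointwise on spatial samples.
spatial_relu = relu(via_freq)
# Naive attempt: ReLU applied pointwise to the spectrum's real and imaginary parts.
naive = np.fft.ifft(relu(Y.real) + 1j * relu(Y.imag)).real
nonlinear_matches = np.allclose(spatial_relu, naive)

print(linear_matches, nonlinear_matches)
```

The linear step agrees in both domains, while the pointwise nonlinearity applied to the spectrum gives a different signal, so in this sketch you'd still need an inverse transform before the activation.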
But if your network contains a layer whose linear part performs an (approximate) DFT, you will get an efficiency gain by replacing it with an exact FFT.

You wouldn't want to use an FFT for most CNNs anyway because the kernels have very small support. Convolution with them is O(n) in the spatial domain as long as you recognize the sparsity.
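
For a sense of scale, a small sketch with a 3-tap kernel on a 1-D signal (the sizes and names are arbitrary assumptions of mine). The direct route does about n*k multiply-adds, which is linear in n for fixed k, while the FFT route pads the tiny kernel out to the full signal length first.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4096, 3
x = rng.standard_normal(n)       # signal
kernel = rng.standard_normal(k)  # small-support kernel

# Direct convolution: k multiply-adds per output sample.
direct = np.convolve(x, kernel, mode="full")

# FFT route: zero-pad both to the full output length, multiply spectra, invert.
m = n + k - 1
via_fft = np.fft.irfft(np.fft.rfft(x, m) * np.fft.rfft(kernel, m), m)

agree = np.allclose(direct, via_fft)
direct_ops = n * k  # rough multiply-add count for the direct route
print(agree, direct_ops)
```

Both give the same output, but the FFT version transforms three length-m arrays to apply a kernel with only three nonzero taps.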

Why can't you apply the nonlinear activation functions in the frequency domain? What's stopping this or making it not work?