|
|
|
|
|
by touisteur
554 days ago
|
|
Talking long convolutions, on parallel architectures, through FFT there's a lot of performance to gain from Overlap-and-Add or Overlap-and-Save schemes, especially on GPUs with cuFFTDx which brings a whole new set of device primitives to the table and looking at (or getting inspiration from) tcFFT which allows using Tensor Cores for FFT and actually increases throughput on lots of convolution workloads. |
|
I'm curious if there are other methods for approximating long convolutions that are well-known or widely-used, outside of overlap-add and overlap-save? I'm in the audio field and interested in learning long _FIR_ filters to describe the resonances of physical objects, like instruments, or rooms. Block-coding, or fixed-frame size approaches reign supreme, of course, but have their own issues in terms of windowing artifacts, etc.
I'm definitely aware that multiplication in the (complex) frequency domain is equivalent to convolution in the time domain and that, because of the fast-fourier transform, this can yield increased efficiency. However, this still results in storing a lot of gradient information that my intuition tells me (possibly incorrectly) is full of redundancy and waste.
Stateful, IIR, or auto-regressive approaches are _one_ obvious answer, but this changes the game in terms of training and inference parallelization.
A couple ideas I've considered, but have not yet tried, or looked too deeply into:
- First performing PCA in the complex frequency domain, reducing the point-wise multiplication that must occur. Without some additional normalization up-front, it's likely this would be equivalent to downsampling/low-pass filtering and performing the convolution there. The learnable filter bank would live in the PCA space, reducing the overall number of learned parameters.
- A Compressed Sensing inspired approach, where we perform a sparse, sub-sampled random set of points from both signals and recover the full result based on the assumption that both convolver and convolvee? are sparse in the fourier domain. This one is pretty half-baked.
I'd love to hear about papers you've read, or thoughts you've had about this problem.