by jessejengel
2350 days ago
Hi, I'm Jesse, one of the authors. Thanks for the interesting questions!
- In terms of the FIRs, I think you can view this as a more general, nonlinear form of filter modeling. The difference, I think, is that you can have a filter as one of several components and adapt them all jointly to achieve some task, which can itself be defined more flexibly (different losses, adversarial training, etc.). The filter itself is still just LTV-FIR, but it's being controlled nonlinearly. We have only examined synthesis so far, but other signal processing problems like denoising are definitely good directions. The "effects" processors are designed for this.
- It's true that neural networks often learn correlated parameters, but this is usually of less significance because they operate in an overparameterized "interpolative" regime, and there is a lot of interesting ongoing research trying to understand it.
- We didn't do a quantitative comparison, but in general the tradeoffs will be different. Dereverberation by a modular generative model will only sound as good as the generative model itself, so artifacts will come from not modeling the source properly. However, if you learn a good model, the dereverberation should be essentially perfect (you can losslessly apply a different reverb), although that's a big if.
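To make "still just LTV-FIR, but controlled nonlinearly" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: `controller`, `ltv_fir`, and the two-tap filter are hypothetical names and choices made up for illustration, with a sigmoid standing in for whatever network would actually produce the taps.

```python
import math


def controller(c):
    """Hypothetical stand-in for a neural network: map one control value to 2 FIR taps."""
    a = 1.0 / (1.0 + math.exp(-c))  # sigmoid keeps the mix in (0, 1)
    return [a, 1.0 - a]


def fir_convolve(x, h):
    """Full linear convolution of signal x with FIR taps h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def ltv_fir(signal, taps_per_frame, frame_size):
    """Filter each non-overlapping frame with its own taps and overlap-add the tails."""
    n_taps = max(len(h) for h in taps_per_frame)
    out = [0.0] * (len(signal) + n_taps - 1)
    for i, h in enumerate(taps_per_frame):
        start = i * frame_size
        frame = signal[start:start + frame_size]
        if not frame:
            break  # more tap sets than frames; ignore the extras
        for n, v in enumerate(fir_convolve(frame, h)):
            out[start + n] += v
    return out


# Each frame is filtered linearly, but the taps come from a nonlinear map of a control signal.
controls = [-2.0, 0.0, 2.0]
taps = [controller(c) for c in controls]
filtered = ltv_fir([1.0] * 6, taps, frame_size=2)
```

Within any one frame the operation is an ordinary FIR convolution; the nonlinearity lives entirely in how the taps are chosen, which is the part that would be learned jointly with the other components.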
|
I do think you should investigate comparisons to adaptive FIRs much more. This field is critical to the design of low power medical devices like hearing aids, which need feedback reduction, echo cancellation, and the like with minimal filter orders.
My question on correlated parameters was a bit more abstract. Often in the design of classical audio signal processors for creative applications, you find that the user-space parameters can be correlated. They map to design-space parameters that are even more correlated, and down to implementation-level parameters that are more correlated still. For example, in a filter designed by frequency sampling, the adjacent bins of an FFT are highly correlated in their I/O, and I was curious whether you optimized a bit by taking a DCT or a similar approach for reparameterization, like you'd find in calculating MFCCs and the like. It's really tough to design ML approaches for creative signal processing that are better than traditional methods because of this: humans learn and adapt to correlations very quickly, machines not so much when dealing with oscillation and ripple. Many local extrema in the parameter space and all that.
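As a rough illustration of the reparameterization being asked about, the sketch below describes a magnitude response with a few cosine-basis (DCT-style) coefficients instead of one free value per bin, then builds an odd-length linear-phase FIR by frequency sampling. The function names, the bin count, and the coefficient values are assumptions chosen for the example and are not taken from the paper under discussion.

```python
import math


def magnitudes_from_dct(coeffs, n_bins):
    """Expand a few cosine-basis coefficients into n_bins smooth magnitude samples."""
    return [
        coeffs[0] + sum(
            c * math.cos(math.pi * k * (m + 0.5) / n_bins)
            for k, c in enumerate(coeffs) if k > 0
        )
        for m in range(n_bins)
    ]


def fir_from_magnitudes(mags):
    """Frequency-sampling design: odd-length, linear-phase FIR from bins 0..M."""
    m_half = len(mags) - 1
    n_taps = 2 * m_half + 1
    return [
        (mags[0] + 2.0 * sum(
            mags[k] * math.cos(2.0 * math.pi * k * (n - m_half) / n_taps)
            for k in range(1, m_half + 1)
        )) / n_taps
        for n in range(n_taps)
    ]


# Two free parameters describe a gentle low-pass tilt across 16 bins,
# rather than 16 independent (and strongly correlated) bin values.
coeffs = [0.5, 0.4]
mags = magnitudes_from_dct(coeffs, n_bins=16)
taps = fir_from_magnitudes(mags)  # 31 symmetric taps
```

An optimizer working on `coeffs` moves neighboring bins together by construction, which is one way to bake the bin-to-bin correlation into the parameterization instead of asking the learner to discover it.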