by kd5bjo
2309 days ago
A spectrogram has time on one axis and frequency on the other, so the ultimate result is a multiplication in one dimension and a convolution in the other. It can be used to show things like when a note starts and stops in a piece of music, which is difficult to see in either purely-time or purely-frequency space. Also, it’s computationally intractable to individually train 2^N weights. What a CNN does instead is train a convolution kernel which is passed over the whole domain to produce the input for the next layer; by operating in frequency space, it’s considering neighborhoods of the basis functions e^{j(omega +- epsilon)x} instead of delta(x +- epsilon).
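To make the note-onset point concrete, here is a minimal sketch of a magnitude spectrogram in plain Python. Everything in it is an assumption made for illustration: the function name `spectrogram`, the Hann window, the frame length of 32, the hop of 16, and the test signal (silence followed by a tone). It uses a naive per-frame DFT rather than an FFT to keep the code short.

```python
import cmath
import math

def spectrogram(signal, frame_len, hop):
    """Magnitude STFT: rows are time frames, columns are frequency bins."""
    # Periodic Hann window, applied to each frame before the DFT.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT of the windowed frame, non-negative frequencies only.
        frames.append([
            abs(sum(chunk[t] * cmath.exp(-2j * cmath.pi * k * t / frame_len)
                    for t in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ])
    return frames

# Hypothetical input: 128 samples of silence, then a tone that completes
# exactly 4 cycles per 32-sample frame (so it lands on bin 4).
frame_len, hop = 32, 16
tone = [math.sin(2 * math.pi * 4 * n / frame_len) for n in range(128)]
signal = [0.0] * 128 + tone
spec = spectrogram(signal, frame_len, hop)

quiet_frame = spec[0]    # entirely inside the silence
loud_frame = spec[-1]    # entirely inside the tone
peak_bin = loud_frame.index(max(loud_frame))
```

With these made-up numbers the first frame is all zeros and the last frame peaks at bin 4, so the row where bin 4 lights up marks when the note starts. Neither the raw samples nor a single whole-signal spectrum shows that as directly.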
|
>Also, it’s computationally intractable to individually train 2^N weights.
that's a good point - i'd forgotten for a moment (because i'm so used to cooley-tukey fft) that in principle getting the spectrum involves a matmul against the entire vector. which brings up a potentially interesting question: can you get a DNN to simulate the cooley-tukey fft (stride permutations and all)?
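for reference, a rough sketch of the two things being compared: the spectrum as a dense matmul, and a radix-2 cooley-tukey fft with its input permutation. the function names (`dft_matmul`, `bit_reverse_permute`, `fft_iterative`) and the test vector are made up for illustration, and this is the textbook iterative decimation-in-time form, not a claim about what a network would actually learn.

```python
import cmath

def dft_matmul(x):
    """Naive DFT: x times the dense N x N matrix W[k][t] = exp(-2*pi*j*k*t/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def bit_reverse_permute(x):
    """Reorder x so that index i moves to the bit-reversal of i."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = x[i]
    return out

def fft_iterative(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in bit_reverse_permute(x)]
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                # Butterfly: combine the two half-size transforms.
                even = a[start + k]
                odd = a[start + k + half] * w
                a[start + k] = even + odd
                a[start + k + half] = even - odd
                w *= w_step
        size *= 2
    return a

# Hypothetical length-8 input; both routes should agree up to rounding.
x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
dense = dft_matmul(x)
fast = fft_iterative(x)
max_err = max(abs(d - f) for d, f in zip(dense, fast))
```

the dense version touches all N^2 matrix entries, while the fft is one fixed permutation followed by log2(N) stages of sparse butterflies, each with shared twiddle factors. that fixed sparse structure is the part a network would have to reproduce.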