Hacker News new | ask | show | jobs
by hereforphone 1659 days ago
> It computes FFT to decompose the sound into a set of A·cos(2πwt+φ) waves and drops the phase φ to align all cos waves together. This is known as the auto-correlation function (ACF).

How is simply dropping the phase transforming FFT into ACF (according to various definitions of ACF as shown here: https://en.wikipedia.org/wiki/Autocorrelation)?

2 comments

I might be mistaken but since the auto-correlation function is the inverse FFT of the power spectral density and power spectral density doesn't contain information about the phase. Thus, it's like dropping the phase and taking IFFT(|A|^2)
They're inverse operations but as far as I understand that's not the only thing going on with the ACF. Taking the phase out is just taking the real part of the FFT is it not? Regardless where's the correlation?
Wikipedia is great at obfuscating simple ideas in complex math. The "Efficient computation" explains the idea well, but it could be made even simpler. The amplitude squaring step drops the phase there.
The phase is dropped but isn't there more going on? Comparing the signal to itself at various lags? I don't see how just dropping the phase accomplishes that.
Not really. ACF is defined as a convolution of signal X with itself: XX. But FFT turns a convolution into a dot product: FFT[XX] = FFT[X]·FFT[X], or just |FFT[X]|². But what is this really? If X is a sum of A·cos(2πwt+φ) waves, then FFT[X] is a set of A·exp(iφ) complex numbers. What does |FFT[X]|² do? It turns those complex numbers into A². Inversing this FFT gives a sum of A²·cos(2πwt) waves, so in effect ACF has dropped the phases and squared amplitudes. This is also why ACF have this bright vertical line - this is cos(x) functions piling up together.
ACF has a lag argument correct? Also isn't the bright vertical line just the result of the fact that ACF at lag 0 is 1?
It does, but ACF[X] at 0 is the sum of X[i] squares, so when sound gets louder, ACF at 0 also gets higher.
You're probably sick of this conversation by now :). But at least in radio applications I think that acf[0] is normalized so that it's 1 (typically). And again the ACF is calculated at several lag arguments and the sum is used to build the final graph / array.

But you obviously know more about this than me, I'm just putting out what I know. Your paragraph above, I actually copied so I can study it a few times. So thanks.