by posterboy 1288 days ago
So what is phase? From dabbling with waveforms in audio editors, sampling, and later learning a little bit about complex numbers, phase seems eventually equivalent to what would sound like changing pitch, modulating the frequency of a periodic signal.

The simplest demonstration of it is the Doppler shift. But it's not at all that simple, because when you move relative to the source, the sound pressure and thus the perceived loudness also change, distorting the waveform and thereby introducing resonant frequencies. Now imagine that the transducer is always moving, e.g. a plucked string.

The ideal harmonic pendulum swings periodically, only losing amplitude to damping. But the resonant transducer picks up reflections of its own signal, like coupled pendulums, which are intractable according to the three-body problem.

On top of that, our hearing is fine tuned to voices and qualities of noise.

1 comment

Phase is the offset in time. The functions sin(θ) and sin(θ + c), for arbitrary real c, represent the same frequency signal; they are offset from each other horizontally by c, and that c is a phase difference. It has an interpretation as an angle, when the full cycle of the wave is regarded as degrees around a circle; and that's what I mean by rotating phase.

When you take a window of samples of a signal, and run the FFT on it, for every frequency bin, the calculation determines the amplitude and phase of the signal. If you have a frequency bin whose center is 200 Hz, and there is a 200 Hz signal, then what you get for that frequency bin is a complex number. The complex number's magnitude ("modulus") is the amplitude of that signal, and its angle ("argument") is the phase.
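A minimal sketch of that single-bin calculation, using only the Python standard library. The sample rate, window length, test amplitude, and the name `bin_value` are all assumptions chosen for illustration; the window is picked so that 200 Hz fits a whole number of cycles, and the factor of 2/N is one common scaling, not the only one.

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (assumed for this sketch)
WINDOW = 400         # 50 ms window: exactly 10 cycles of 200 Hz

def bin_value(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Hypothetical helper: one DFT bin, computed directly as a sum."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
              for k, x in enumerate(samples))
    # Scale so that a real sinusoid of amplitude A yields modulus A.
    return 2 * acc / n

# A 200 Hz cosine with amplitude 0.5 and a 30 degree phase offset.
signal = [0.5 * math.cos(2 * math.pi * 200 * k / SAMPLE_RATE + math.radians(30))
          for k in range(WINDOW)]

z = bin_value(signal, 200)
amplitude = abs(z)                         # modulus: recovers the amplitude
phase_deg = math.degrees(cmath.phase(z))   # argument: recovers the phase offset
print(amplitude, phase_deg)
```

Note that the phase here is measured relative to a cosine, which is what the complex exponential convention gives you; measuring against a sine just shifts every answer by 90 degrees.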

If the signal is exactly 200 Hz, and if the successive FFT windows move by a multiple of 1/200th of a second, then the phase will be the same in successive FFT windows.

But suppose that the signal is actually 201 Hz: a little faster. Then with each successive FFT window, the phase will not line up any more with the previous window; it will advance a little bit. We will see a rotating complex value: same modulus, but the angle advancing.

From how fast the angle advances relative to the time step between FFT windows, we can deduce that we are capturing a 201 Hz signal in that bin (on the hypothesis that we have a pure, periodic signal in there).
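Here is a rough sketch of that deduction, assuming an 8000 Hz sample rate, a 50 ms window, and a hop of exactly one 200 Hz period (5 ms). The names `bin_phase` and `estimate_frequency` are made up for this example, and it only works under the stated hypothesis of a single pure tone near the bin center.

```python
import cmath
import math

SAMPLE_RATE = 8000   # assumed
WINDOW = 400         # 50 ms analysis window
HOP = 40             # 5 ms hop: exactly one period of 200 Hz
BIN_HZ = 200         # center frequency of the bin we watch

def bin_phase(samples, start):
    """Phase of the 200 Hz bin for the window beginning at `start`."""
    acc = sum(samples[start + k]
              * cmath.exp(-2j * math.pi * BIN_HZ * k / SAMPLE_RATE)
              for k in range(WINDOW))
    return cmath.phase(acc)

def estimate_frequency(samples):
    """Deduce the true frequency from the phase advance between two windows."""
    p0 = bin_phase(samples, 0)
    p1 = bin_phase(samples, HOP)
    # Wrap the difference into (-pi, pi] so a small advance stays small.
    delta = cmath.phase(cmath.exp(1j * (p1 - p0)))
    hop_seconds = HOP / SAMPLE_RATE
    # The extra rotation per hop, converted to cycles per second,
    # is the offset from the bin center.
    return BIN_HZ + delta / (2 * math.pi * hop_seconds)

def tone(freq_hz):
    return [math.sin(2 * math.pi * freq_hz * k / SAMPLE_RATE)
            for k in range(WINDOW + HOP)]

est_200 = estimate_frequency(tone(200))   # no phase advance: stays at 200
est_201 = estimate_frequency(tone(201))   # small advance: lands close to 201
print(est_200, est_201)
```

The 201 Hz estimate is not exact, because a real-valued sinusoid also leaks a little energy from its negative-frequency half into the bin, but with these settings the error is a small fraction of a hertz.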

How is the phase determined in the frequency bin? It's basically a vector correlation: a dot product. The samples are a vector which is dot-producted with a complex unit vector. The complex unit vector in the 200 Hz bin is essentially a 200 Hz sine and cosine wave, rolled into a single vector with the help of complex numbers. Sine and cosine are 90 degrees apart in phase, so they form a rectilinear basis (coordinate system). The calculation projects the signal, expressing it as a sum of the sine and cosine vectors. How much of one versus the other is the phase. A signal that is 100% correlated with the sine will have a phase angle of 0 degrees or possibly 180. If it correlates with the cosine component, it will be 90 or 270. Or some mixture thereof.
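To make the projection concrete, here is a small sketch that does the two real dot products separately and reads the phase off with `atan2`. The window size, cycle count, and the name `phase_relative_to_sine` are assumptions for illustration. The angle is measured relative to the sine, to match the description above, and is reported in the range -180 to 180, so "270" shows up as -90.

```python
import math

N = 400        # samples in the window (assumed)
CYCLES = 10    # this bin completes 10 full cycles across the window

sine_basis = [math.sin(2 * math.pi * CYCLES * k / N) for k in range(N)]
cosine_basis = [math.cos(2 * math.pi * CYCLES * k / N) for k in range(N)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phase_relative_to_sine(signal):
    """Project onto the sine and cosine vectors; the mix gives the angle."""
    s = dot(signal, sine_basis)      # how much sine is in the signal
    c = dot(signal, cosine_basis)    # how much cosine is in the signal
    return math.degrees(math.atan2(c, s))

# A sine shifted by 45 degrees is an equal mix of the two basis vectors.
shifted = [math.sin(2 * math.pi * CYCLES * k / N + math.radians(45))
           for k in range(N)]

phase_of_sine = phase_relative_to_sine(sine_basis)       # about 0
phase_of_cosine = phase_relative_to_sine(cosine_basis)   # about 90
phase_of_shifted = phase_relative_to_sine(shifted)       # about 45
print(phase_of_sine, phase_of_cosine, phase_of_shifted)
```

The two basis vectors are orthogonal here only because the window holds a whole number of cycles; that is exactly the situation for the rows of a DFT.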

Because a complex number is two real numbers rolled into one, it simplifies the calculation: instead of doing a dot product with a sine and cosine vector to separately correlate the signal to the two coordinate bases, the complex numbers do it in one dot product operation. When we go around the unit circle, each position on the circle is cos(θ) + i sin(θ). These complex values give us samples of both functions. Exactly such values are stuffed into the rows of the DFT matrix: complex values from the unit circle divided into equal divisions.

If you look here at the definition of the ω (omega) parameter:

https://en.wikipedia.org/wiki/DFT_matrix

It is the N-th complex root of unity. But what that really means is that it is a 1/Nth step of the way around the unit circle. For instance if N happened to be 360, then ω is the complex number whose |ω| = 1 (unit vector), and whose argument is 1 degree: one degree around the circle. The second row of the DFT matrix has 1, ω, ω², ω³, ... The second row represents the lowest frequency (after zero, which is the first row). It captures a single cycle of a sine and cosine waveform, in N samples. The values in that row step around the unit circle in the smallest increment, so they go around the circle exactly once. The subsequent rows go around the circle in skipped steps, yielding higher frequencies: 1, ω², ω⁴ for twice around the circle; 1, ω³, ω⁶ for three times, ... we get all the harmonics up to our N resolution.
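A short sketch of building those rows, assuming the negative-exponent convention ω = e^(-2πi/N) used on the linked page (with the opposite sign the steps simply go around the circle the other way). The function names are invented for this example, and the matrix is left unnormalized.

```python
import cmath
import math

def root_of_unity(n):
    """One 1/n step around the unit circle (negative-exponent convention)."""
    return cmath.exp(-2j * math.pi / n)

def dft_matrix(n):
    """Unnormalized n-by-n DFT matrix: entry (row, col) is omega**(row*col)."""
    omega = root_of_unity(n)
    return [[omega ** (row * col) for col in range(n)] for row in range(n)]

# With N = 360 each step is one degree of arc on the unit circle.
omega_360 = root_of_unity(360)
step_degrees = abs(math.degrees(cmath.phase(omega_360)))

m = dft_matrix(8)
first_row = m[0]    # all ones: the zero-frequency row
second_row = m[1]   # 1, omega, omega^2, ...: once around the circle
third_row = m[2]    # 1, omega^2, omega^4, ...: twice around, in skipped steps

# The third row visits every second point of the second row, wrapping around.
skipped = [second_row[(2 * col) % 8] for col in range(8)]
print(step_degrees, abs(omega_360))
```

Multiplying this matrix by a vector of N samples performs the same per-bin dot products described earlier, one row per frequency bin.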

> on the hypothesis that we have a pure, periodic signal in there

That pure sine wouldn't generate any artefacts. It would result in a 200 Hz output from the AI if it throws the phase information out. You wouldn't hear a difference unless it's an (aptly so called) complex signal. E.g. 200 and 201 Hz layered is an impure signal with a period below 1 Hz, far outside the scope. Eventually the signals will cancel out completely. [1]

The important point is, I think, that FFT doesn't simply look at the offset aka phase. Rather, 201 Hz looks like a 200 Hz that is moving. So it encodes phase-shift in the delta of the offset between two windows. For a sum of 200 and 201 Hz it has to assume that the magnitude is also changing, which I find entirely counterintuitive.
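For what it's worth, the changing magnitude is easy to see numerically. The following is only a sketch under assumed settings (8000 Hz sample rate, 50 ms windows, one second of signal, and a made-up helper `bin_magnitude`): it sums a 200 Hz and a 201 Hz sine and tracks the modulus of the 200 Hz bin as the window slides.

```python
import cmath
import math

SAMPLE_RATE = 8000   # assumed
WINDOW = 400         # 50 ms window
STEP = 200           # slide the window by 25 ms each time

def bin_magnitude(samples, start, freq_hz=200):
    """Modulus of one DFT bin for the window beginning at `start`."""
    acc = sum(samples[start + k]
              * cmath.exp(-2j * math.pi * freq_hz * k / SAMPLE_RATE)
              for k in range(WINDOW))
    return 2 * abs(acc) / WINDOW

# One second of 200 Hz and 201 Hz sines layered at equal amplitude.
two_tones = [math.sin(2 * math.pi * 200 * k / SAMPLE_RATE)
             + math.sin(2 * math.pi * 201 * k / SAMPLE_RATE)
             for k in range(SAMPLE_RATE)]

starts = range(0, SAMPLE_RATE - WINDOW + 1, STEP)
magnitudes = [bin_magnitude(two_tones, s) for s in starts]

loudest = max(magnitudes)    # near the start and end of the second
quietest = min(magnitudes)   # around the half-second mark
print(loudest, quietest)
```

With a window this short, the bin cannot separate the two tones, so what it reports is their sum: close to double amplitude when they line up and close to zero half a second later when they are in opposition. A much longer window (on the order of seconds) would start to resolve them into separate bins instead.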

From the mathematical perspective, this seems like boring homework, far detached from acoustics. So, I don't know. The funny thing is that rotation is very real in the movement of strings. If the orbit in one point is elliptic, that's like two sinusoids at different magnitudes offset by some 90 degrees, in a simplified model. But it has nearly infinite coupled points along its axis. As they excite each other, and each point has a different distance to the receiver, that's where phase shift happens.

> If you look here at the definition of the ω (omega) parameter

I wasn't going to make drone, but I will take a look.

1: https://graphtoy.com/?f1(x,t)=100*sin(x)&v1=true&f2(x,t)=100...