Hacker News
by awelkie 2046 days ago
Exactly. I wrote this explanation, but you beat me to it, so I'll just post it here:

The real reason we use I/Q sampling is because we want to frequency-shift a signal.

Why do we want to frequency-shift a signal? In radio frequency applications the signal of interest almost always has a much lower bandwidth than its highest frequency. In other words, the signal has a small bandwidth (say 40 MHz) centered around a high center frequency (say 2.4 GHz). If we want to digitize the signal, then one way would be to use a very high sample-rate ADC (for 2.4 GHz, direct Nyquist sampling needs roughly 5 GS/s). But these are very expensive, and a much better way of digitizing the signal is to use a mixer (a frequency shifter) to shift the signal to be centered around 0 Hz and then use a relatively low sample-rate ADC (e.g. 40 MS/s).

The way frequency shifting is done is by multiplying the signal by a sine signal, which can be done in hardware. But this introduces a distortion to the signal because multiplying by a sine is not actually a frequency shift. It just so happens that this distortion is cancelled out by adding another copy of the signal multiplied with another sine delayed by 90°. But this addition needs to be complex (due to the relationship between sine functions and true frequency shifts), so what we do is sample the two distorted signals and do this complex addition with the digital signals.
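A minimal sketch of that paragraph, using only the Python standard library. The window length `N`, the LO bin `K_LO` and the helper names are illustrative assumptions, not anything from a real SDR API. It mixes a tone just above the LO and a tone just below it, first with a single cosine and then with a cosine/sine pair combined as I + jQ:

```python
import cmath
import math

N = 64      # samples per analysis window (illustrative choice)
K_LO = 8    # local-oscillator frequency, in DFT bins (illustrative choice)


def tone(k):
    """Real cosine with exactly k cycles in N samples."""
    return [math.cos(2 * math.pi * k * i / N) for i in range(N)]


def dft_mag(x, k):
    """Magnitude of DFT bin k (negative k allowed), normalised by N."""
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / N) for i, v in enumerate(x))
    return abs(acc) / N


def mix_real(x):
    """Multiply by a single real cosine, as one mixer would."""
    return [v * math.cos(2 * math.pi * K_LO * i / N) for i, v in enumerate(x)]


def mix_iq(x):
    """Two mixers 90 degrees apart, combined as I + jQ = x * exp(-j*w*t)."""
    i_branch = [v * math.cos(2 * math.pi * K_LO * i / N) for i, v in enumerate(x)]
    q_branch = [-v * math.sin(2 * math.pi * K_LO * i / N) for i, v in enumerate(x)]
    return [complex(a, b) for a, b in zip(i_branch, q_branch)]


above = tone(K_LO + 2)   # tone 2 bins above the LO
below = tone(K_LO - 2)   # tone 2 bins below the LO

for name, x in (("above", above), ("below", below)):
    r, c = mix_real(x), mix_iq(x)
    print(name,
          "| real mix at (+2, -2):", round(dft_mag(r, 2), 3), round(dft_mag(r, -2), 3),
          "| I/Q mix at (+2, -2):", round(dft_mag(c, 2), 3), round(dft_mag(c, -2), 3))
```

With the single cosine, both input tones leave the same magnitudes at bins +2 and -2, so after low-pass filtering you cannot tell which side of the LO the tone came from. With the I/Q pair, the tone above the LO lands only at +2 and the tone below lands only at -2, which is what a true frequency shift would do.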

So the reason we have complex samples is because that's the best way we've found to do frequency shifting using real-only sine waves (this explains why we don't use complex numbers in audio signal processing; there's no need to do frequency shifting!). This tutorial goes into the details and is the best explanation I've seen on quadrature sampling (another term for I/Q sampling): https://www.dsprelated.com/showarticle/192.php

I think engineers (myself included) tend to get confused because using complex numbers makes the math simpler, and so they think that's the real reason we use them. All the talk about ambiguous frequencies or negative frequencies or needing to know the phase of a sample is true, but all of those problems could be solved without complex numbers simply by sampling twice as fast and then doing some math (again, audio DSP does just fine without quadrature sampling), so it's not a "real" reason to do this strange kind of sampling.


> The way frequency shifting is done is by multiplying the signal by a sine signal, which can be done in hardware. But this introduces a distortion to the signal because multiplying by a sine is not actually a frequency shift. It just so happens that this distortion is cancelled out by adding another copy of the signal multiplied with another sine delayed by 90°. But this addition needs to be complex (due to the relationship between sine functions and true frequency shifts), so what we do is sample the two distorted signals and do this complex addition with the digital signals.

I'm not sure I understand you correctly, but I would not say you distort the signal when you multiply it with a sine wave. Essentially you create two frequency components, the sum and difference frequencies (f1+f2 and f1-f2). If f1 is your modulated signal (so really f1+fmod, where fmod is a band that can be positive or negative) and you want to convert to baseband, you select f2 to sit at the carrier (f1=f2). That gives you a baseband signal at zero carrier frequency plus a signal at 2×f1, which is usually outside your detector bandwidth and so is not detected. However, this process only gives you half of the frequencies of your fmod. To get the other half you need to multiply with cosine(f2), which essentially gives you the component that was at 2×f1, now at baseband. To handle that more elegantly in the math, you add the two components up as real and imaginary parts, which essentially lets you drop the cos/sin(f1) terms from your equations.
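For reference, the sum and difference terms can be written out with the standard product-to-sum identities. This is a sketch under assumed notation: the modulated input is taken as $x(t) = A(t)\cos(2\pi f_1 t + \varphi(t))$, where $A$ and $\varphi$ are symbols introduced here for the slowly varying amplitude and phase, and the local oscillator is at $f_2$.

```latex
\begin{aligned}
x(t)\cos(2\pi f_2 t)
  &= \tfrac{A}{2}\Big[\cos\!\big(2\pi (f_1 - f_2)t + \varphi\big)
     + \cos\!\big(2\pi (f_1 + f_2)t + \varphi\big)\Big] \\
-\,x(t)\sin(2\pi f_2 t)
  &= \tfrac{A}{2}\Big[\sin\!\big(2\pi (f_1 - f_2)t + \varphi\big)
     - \sin\!\big(2\pi (f_1 + f_2)t + \varphi\big)\Big]
\end{aligned}
```

With $f_1 = f_2$ and the $f_1 + f_2$ terms removed by a low-pass filter, the two branches reduce to $I = \tfrac{A}{2}\cos\varphi$ and $Q = \tfrac{A}{2}\sin\varphi$, so $I + jQ = \tfrac{A}{2}e^{j\varphi}$. Neither branch alone determines the sign of $\varphi$, which is why both are needed.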

The reason why audio processing (not sampling) does fine without I/Q data is because our ears are almost completely insensitive to the phase relationships between different frequency components, and because additive frequency shifts are not musically useful. That is what is very hard to deal with without representing signals as I/Q. The audio world just doesn't care. Radio does. This is why most textbook audio equalizers (including those used in professional DAWs) have nonlinear phase by default (minimum-phase) unless you opt for an FIR- or FFT-based mode. That would never fly in radio.

That's not really the reason. It's important to note that phase is not something absolute; it only makes sense in relation to something. Absolute phase would have to be defined from the beginning of the universe, which is nonsensical.

You are right that nobody can hear phase, but nobody can see phase either, again because you need to relate it to (interfere it with) something. However, it does make a difference if we think about the superposition (interference) of different audio frequency components. We would definitely hear some of those phase differences.

That said, I/Q does not make sense in audio processing because it's baseband. There is no carrier wave.

We do not hear phase differences in the relative phase of different audio frequency components. Try it for yourself. Run a song through an allpass filter. It'll sound the same. In fact, speaker systems of all kinds do crazy things to the phase of signals, and nobody cares (what we care about is frequency and transient response).
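A small sketch of what an allpass filter does to a signal, using only the Python standard library. The harmonic count and the fixed phase pattern are arbitrary assumptions standing in for a real filter. It builds the same eight partials twice, once phase-aligned and once with scrambled phases, and compares waveform and magnitude spectrum:

```python
import cmath
import math

N = 256                   # samples per period (illustrative choice)
HARMONICS = range(1, 9)   # eight partials of equal amplitude


def synth(phases):
    """Sum of unit-amplitude cosines, one per harmonic, with the given phases."""
    return [
        sum(math.cos(2 * math.pi * k * i / N + phases[k]) for k in HARMONICS)
        for i in range(N)
    ]


def dft_mag(x, k):
    """Magnitude of DFT bin k, normalised by N."""
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / N) for i, v in enumerate(x))
    return abs(acc) / N


aligned = synth({k: 0.0 for k in HARMONICS})
# Arbitrary fixed phase pattern, standing in for the effect of an allpass filter.
scrambled = synth({k: 0.7 * k * k for k in HARMONICS})

print("peak, aligned:  ", round(max(abs(v) for v in aligned), 3))
print("peak, scrambled:", round(max(abs(v) for v in scrambled), 3))
for k in HARMONICS:
    print(k, round(dft_mag(aligned, k), 6), round(dft_mag(scrambled, k), 6))
```

The two waveforms look quite different in the time domain, but every partial has the same magnitude in both, which is the property the listening test relies on.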

The same is not true for radio. There, corrupting the phase relationships corrupts the data (for many systems).

Phase is relative, but our ears don't care about relative phase either (at least as long as you don't stick nonlinear filters after, then it starts mattering, but usually in audio things are fairly decorrelated anyway so it only matters in quite specific cases).

Here is an example: https://twitter.com/marcan42/status/1282685645731672064

Demo: https://twitter.com/zwegner/status/1282859889447116809 (interestingly, you can hear the change in Twitter's low quality encode, but it goes away at higher qualities, so it seems their crappy AAC encoder does care about relative phase :-))

It's counterintuitive how little our ears care about phase across frequency bands. This is not true for other kinds of signals.

I just tried this and you're right, completely random phase across the whole frequency band does not noticeably change things. Funny that we learned that differently. Thanks, I learnt something about audio today :).

AFAIK phase only really matters in audio for speaker enclosure design and placement, and likewise for microphones.

However, because it's virtually unknown to audio folks, there are perceptible nodes everywhere, if you can hear them.

Audio processing isn't shifted down to baseband or shifted at all, so there is no need for I/Q. It's all real. If, instead of a direct mix down to baseband, you tell the SDR to mix the minimum frequency in the signal you care about down to just above zero, you can work without I and Q. For instance, if you mix an AM radio frequency down to audio frequency, it's all real and you can hear it and represent it as an array of real values.

Edit: this is how the Airspy SDR works, to avoid the I/Q imbalance you get in the direct-conversion receivers in most SDRs.

Second edit, for terminology: mixing is multiplying by a frequency to shift frequency. Baseband means you shifted the center of the frequencies you care about to zero, so half of the frequency content is negative. Negative frequencies are what drive that mean imaginary number into the whole thing.

You're making the mistake of assuming that the only purpose of IQ data is to represent negative frequencies after downconversion. This is not true. The IQ representation is extremely useful for certain kinds of processing, even if you're working in baseband. There are plenty of reasons to take a real baseband signal, run it through a Hilbert transform to get a Q, and process it as IQ data.
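As a rough illustration of "run it through a Hilbert transform to get a Q", here is a standard-library sketch that builds the analytic signal by zeroing the negative-frequency DFT bins. The function name `analytic_signal` and the test tone are assumptions for the example, and the O(N²) DFT is only there to avoid dependencies:

```python
import cmath
import math


def analytic_signal(x):
    """Return I + jQ for a real sequence x, where Q is its Hilbert transform."""
    n = len(x)
    spectrum = [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
        for k in range(n)
    ]
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2   # keep positive frequencies, doubled
        elif k > n / 2:
            spectrum[k] = 0    # discard negative frequencies
    return [
        sum(s * cmath.exp(2j * math.pi * k * i / n) for k, s in enumerate(spectrum)) / n
        for i in range(n)
    ]


N = 64   # window length (illustrative choice)
K = 5    # tone frequency in DFT bins (illustrative choice)
real_tone = [math.cos(2 * math.pi * K * i / N) for i in range(N)]

iq = analytic_signal(real_tone)
envelope = [abs(z) for z in iq]                  # instantaneous amplitude
phase = [cmath.phase(z) for z in iq]             # instantaneous phase

print("envelope min/max:", round(min(envelope), 6), round(max(envelope), 6))
```

For a pure cosine the I part is the original signal and the Q part is the matching sine, so the envelope comes out flat. Instantaneous amplitude and phase are the kind of quantities that are awkward to get from the real samples alone.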

It just so happens that audio DSP algorithms happen to almost never care about those exact kinds of processing, due to the way our ears and brains work. And thus, IQ data is not used in audio. But it's not because it's baseband. It's because our ears don't care about phase relationships (which is one thing you can more easily preserve in the IQ domain) and because frequency shifts like downconversion are not useful in music since they destroy the harmonic relationships in the sound.

I wouldn't have used the terms real and baseband together, but I think I understand what you mean. I've been frustrated when people describe a modulation as real when they could have described it more elegantly as complex. With modern floating point registers being so large the phase loss is less important, but sometimes the representation just makes more sense symmetrical around zero (DC). Could you explain what you mean by harmonic relationships in sound? Does that imply AM will destroy some quality of the music even if you used a 22 kHz wide band?

I mean if you add 10 Hz to all frequency components in audio, what used to be harmonics (integer multiples of the fundamental frequency) stop being harmonics and it sounds like a dissonant mess. There is no reason to ever frequency-shift music/audio by an offset (i.e. the same thing modulation does in radio, or multiplying by a carrier in the IQ time domain). The only frequency shifting we do in audio is by multiplying the frequencies (that's resampling in the time domain), which is a different story.

100, 200, 400 Hz is a consonant tone, while 110, 210, 410 Hz is a dissonant mess.
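The arithmetic behind that example, as a short sketch. The partial frequencies are the ones from the comment; the 11/10 scale factor is an assumed value chosen so that both operations move the fundamental to 110 Hz:

```python
from fractions import Fraction

partials = [100, 200, 400]   # Hz, from the example above


def ratios(freqs):
    """Each partial expressed as a ratio to the lowest one."""
    return [Fraction(f, freqs[0]) for f in freqs]


# Additive shift: what an offset in radio-style mixing does.
shifted = [f + 10 for f in partials]

# Multiplicative scaling: what resampling / pitch shifting does.
scaled = [f * Fraction(11, 10) for f in partials]

print("original:", ratios(partials))
print("shifted: ", ratios(shifted))
print("scaled:  ", ratios(scaled))
```

The scaled set keeps the 1:2:4 relationship, while the shifted set ends up at 1 : 21/11 : 41/11, so the upper partials are no longer integer multiples of the fundamental.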

AM doesn't have this problem because it has symmetric sidebands and a carrier (so a tuning offset does not result in audio frequency shift), but SSB does. If you listen to an SSB transmission without your tuning being perfect, it sounds horrible. Voice sounds distorted, and music is hideous. I'm having trouble finding an example of the latter, probably because nobody dares put music through SSB :-) (but you can do this easily enough in gnuradio by upconverting a song with a 10Hz offset, for example)

Also, thank you for taking the time to educate me. I didn't take any signal processing classes in my EE degree, so I learned everything on the job and have gaps. How does autotune not sound horrible? If they do it right it is indistinguishable, or so I have read.
I've heard what you are talking about in SSB; I didn't know what it was. I don't quite understand the AM thing. Is it that a tuning offset would grab the image of the other sideband to correct stuff?
Yes, our ears hear power, not amplitude, so phase isn't so important... except maybe for strange mixing products and reflections off of walls? OK, the audiophiles can hear some of that, but generally it's true if you're listening through headphones.
We do hear relative phase relationships at any given frequency between both ears. So if you phase shift one side of a stereo signal and not the other, then yes, that is very audible.

But nodes and mixing products are independent of overall phase across the power spectrum, in a linear system. So if you apply the same phase change to both left and right, the distribution of nodes in the room won't change. The only time these inter-frequency phase relationships start to matter is when you introduce nonlinearities, like distortion.

Yes, directional hearing is quite sensitive to phase, but there are often multi-reflections inside the outer ear that allow some people to hear phase discontinuities in mono.

Anyone can hear phase discontinuities because any phase discontinuity is just a burst of high frequency content.

But typical reflections off of surfaces are largely linear as far as I know, and any linear operation will not introduce any power spectrum changes that are phase dependent. As far as I know, the ear canal can be largely modeled as a linear system (to within the thresholds of hearability).

The only way to hear phase is to introduce a nonlinearity. That then generates harmonics (or sometimes even lower frequencies), and their power spectrum depends on the specific phase relationships of the incoming signal.

A physical example of a nonlinearity would be a vibrating surface that hits another surface at a certain excursion. Depending on the relative phases of the excitation signal, you can have different peak excursion, and therefore clearly get a different result if one phase set makes it reach the other surface and another one doesn't.
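A toy version of that vibrating-surface example, standard library only. The two-tone signal, the clip level of 1.7 and the choice of the fifth harmonic as the probe are all assumptions made for illustration. Both inputs have identical magnitude spectra and differ only in the phase of the third harmonic:

```python
import cmath
import math

N = 256      # samples per period (illustrative choice)
CLIP = 1.7   # excursion limit, standing in for "hits the other surface"


def two_tone(phase):
    """Fundamental plus third harmonic, with the harmonic offset by `phase`."""
    return [
        math.cos(2 * math.pi * i / N) + math.cos(3 * 2 * math.pi * i / N + phase)
        for i in range(N)
    ]


def clip(x, limit):
    """Hard limiter: a simple memoryless nonlinearity."""
    return [max(-limit, min(limit, v)) for v in x]


def dft_mag(x, k):
    """Magnitude of DFT bin k, normalised by N."""
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / N) for i, v in enumerate(x))
    return abs(acc) / N


in_phase = two_tone(0.0)       # peaks add up and exceed the limit
flipped = two_tone(math.pi)    # peaks partly cancel and stay under the limit

for name, x in (("in phase", in_phase), ("flipped", flipped)):
    y = clip(x, CLIP)
    print(name,
          "| peak:", round(max(abs(v) for v in x), 3),
          "| 5th harmonic before:", round(dft_mag(x, 5), 6),
          "after:", round(dft_mag(y, 5), 6))
```

Only the in-phase version reaches the limit, so only it picks up new harmonic content after clipping. The phase relationship, which changed nothing in the linear case, now changes the power spectrum.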

I'd add that the other reason for using I/Q in heterodyned demodulation is for SSB (Single Side Band) to reduce out of band interferers. Otherwise both Fmix +/- Ffilt will get through. Usually, transmitters are also SSB to reduce power (unless they are baseband direct modulation).