by marcan_42 2046 days ago
You're making the mistake of assuming that the only purpose of IQ data is to represent negative frequencies after downconversion. This is not true. The IQ representation is extremely useful for certain kinds of processing, even if you're working in baseband. There are plenty of reasons to take a real baseband signal, run it through a Hilbert transform to get a Q, and process it as IQ data.
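To make the "Hilbert transform to get a Q" step concrete, here is a minimal sketch using only the Python standard library. It builds I + jQ from a real sequence by zeroing the negative-frequency bins, which is the usual FFT-based way of getting the analytic signal. The helper names (`dft`, `analytic_signal`) and the 5-cycle test tone are made up for illustration; in practice you would reach for something like `scipy.signal.hilbert` instead of a naive DFT.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) DFT; fine for a short illustrative signal."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Return I + jQ for a real sequence x by removing negative frequencies."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue              # DC and Nyquist bins stay as they are
        elif k < (n + 1) // 2:
            spectrum[k] *= 2      # positive frequencies are doubled
        else:
            spectrum[k] = 0       # negative frequencies are zeroed
    return dft(spectrum, inverse=True)

n = 64
cycles = 5  # hypothetical test tone: 5 whole cycles in the window
x = [math.cos(2 * math.pi * cycles * m / n) for m in range(n)]

iq = analytic_signal(x)
i_part = [z.real for z in iq]   # I is the original real signal
q_part = [z.imag for z in iq]   # Q is its Hilbert transform (a sine here)
```

For a pure cosine the I channel comes back unchanged, Q is the matching sine, and the magnitude is constant, which is the kind of property that makes amplitude and phase processing convenient in IQ form.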

It just so happens that audio DSP algorithms almost never need those kinds of processing, due to the way our ears and brains work. And thus, IQ data is not used in audio. But it's not because it's baseband. It's because our ears don't care about phase relationships (which is one thing you can more easily preserve in the IQ domain) and because frequency shifts like downconversion are not useful in music, since they destroy the harmonic relationships in the sound.

I wouldn't have used the terms "real" and "baseband" together, but I think I understand what you mean. I've been frustrated when people describe a modulation as real when they could have described it more elegantly as complex. With modern floating-point registers being so large, the phase loss is less important, but sometimes the representation just makes more sense symmetrical around zero (DC). Could you explain what you mean by harmonic relationships in sound? Does that imply AM will destroy some quality of the music even if you used a 22 kHz wide band?
I mean if you add 10 Hz to all frequency components in audio, what used to be harmonics (integer multiples of the fundamental frequency) stop being harmonics and it sounds like a dissonant mess. There is no reason to ever frequency-shift music/audio by an offset (i.e. the same thing modulation does in radio, or multiplying by a carrier in the IQ time domain). The only frequency shifting we do in audio is by multiplying the frequencies (that's resampling in the time domain), which is a different story.

100, 200, 400 Hz is a consonant tone, while 110, 210, 410 Hz is a dissonant mess.
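The arithmetic behind that example can be checked in a few lines. This is just a sketch: the 10 Hz offset and the 100/200/400 Hz partials come from the comment, while the 11/10 scale factor and the `ratios` helper are picked here for illustration.

```python
from fractions import Fraction

partials = [100, 200, 400]  # Hz, from the example above

def ratios(freqs):
    """Ratio of each partial to the lowest one, as exact fractions."""
    return [Fraction(f, freqs[0]) for f in freqs]

# Adding an offset (what mixing/modulation does) breaks the ratios.
shifted = [f + 10 for f in partials]                # 110, 210, 410 Hz

# Multiplying (what resampling does) keeps them intact.
scaled = [f * Fraction(11, 10) for f in partials]   # 110, 220, 440 Hz

print(ratios(partials))  # 1, 2, 4
print(ratios(shifted))   # 1, 21/11, 41/11
print(ratios(scaled))    # 1, 2, 4
```

Both versions move the fundamental from 100 Hz to 110 Hz, but only the scaled one keeps the upper partials at exactly 2x and 4x the fundamental.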

AM doesn't have this problem because it has symmetric sidebands and a carrier (so a tuning offset does not result in audio frequency shift), but SSB does. If you listen to an SSB transmission without your tuning being perfect, it sounds horrible. Voice sounds distorted, and music is hideous. I'm having trouble finding an example of the latter, probably because nobody dares put music through SSB :-) (but you can do this easily enough in gnuradio by upconverting a song with a 10Hz offset, for example)
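If you don't have gnuradio handy, the effect of a mistuned SSB receiver can be sketched in plain Python. This assumes an idealized receiver whose only flaw is a 10 Hz tuning error, modelled as multiplying the IQ (analytic) version of the audio by a complex carrier at the offset frequency. The sample rate and the function names are invented for the example.

```python
import cmath
import math

fs = 8000                    # sample rate in Hz (assumed)
offset = 10                  # tuning error in Hz, as in the comment above
partials = [100, 200, 400]   # the consonant example tone

def tone_iq(freqs, n):
    """IQ (analytic) form of a sum of equal-amplitude tones at sample n."""
    return sum(cmath.exp(2j * math.pi * f * n / fs) for f in freqs)

def mistuned_audio(n):
    """Audio sample from an idealized SSB receiver tuned `offset` Hz off."""
    mixer = cmath.exp(2j * math.pi * offset * n / fs)  # complex carrier
    return (tone_iq(partials, n) * mixer).real

samples = [mistuned_audio(n) for n in range(2000)]

# What we expect to hear: every partial moved up by the same 10 Hz.
expected = [
    sum(math.cos(2 * math.pi * (f + offset) * n / fs) for f in partials)
    for n in range(2000)
]
max_error = max(abs(a - b) for a, b in zip(samples, expected))
```

`max_error` comes out at floating-point noise level, i.e. the receiver output really is the 110, 210, 410 Hz tone rather than anything harmonically related to the original.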

Also, thank you for taking the time to educate me. I didn't take any signal processing classes in my EE degree, so I learned everything on the job and have gaps. How does autotune not sound horrible? If they do it right it is indistinguishable, or so I have read.
Autotune works by resampling and time stretching (not sure if in the time or frequency domain; it depends on the technology, and there are many variants) in order to decouple pitch and duration to make adjustments, so it doesn't break harmonic relationships.

Audio time stretching (or equivalently, changing the pitch without changing time) is not a clearly defined process with a mathematical description (unlike plain resampling or modulation) but rather a semi-heuristic process that takes into account psychoacoustics. But yes, in practice, for small adjustments of a monophonic sample like a voice, modern algorithms sound really good.

I've heard what you are talking about in SSB; I just didn't know what it was. I don't quite understand the AM thing. Is it that a tuning offset would grab the image of the other sideband to correct stuff?
AM reception basically uses envelope tracking, so you don't really care about the carrier frequency. It's really just "how much power am I receiving". The tuning ends up defining the window of spectrum you average power over.

In the frequency domain, you could think of AM demodulation as computing the width (and phase!) between the carrier images on both sidebands. It doesn't matter if the signal is a bit off to the side, because the width will be the same. You have a mirror image which gives an absolute reference.

In the IQ domain, you look at the magnitude of the vectors, not their angle, so you don't care about the frequency.

In SSB you only have one sideband, and often no carrier at all, so there is no reference. You need to nail the frequency to get a proper signal out. And even then AIUI your phases will be random, though that doesn't matter for audio.