by stolio 4187 days ago
I thought the bear picture was maybe the best part of the article. Compression is tricky, multiband compression even more so, and the type of multiband compression they're using is harder still to wrap your head around. (There are two ways that I know of to do what they're doing, but I doubt they want to talk about which one they use.)

It's a great layman's explanation, but if you have a better one I'd love to see it.

1 comments

I think the point of the bear picture isn't that you can't see half the bear, but rather that (to continue the visual analogy) you can't see half of the bear's colors. How do you fix something that you can't perceive? You can change the missing colors to something that you can see, but you end up distorting the original image.

In the SoundCloud samples, if I can't hear anything above a certain frequency, making those sounds louder isn't going to help. You can shift them down in frequency, but my guess is that it's going to sound pretty ugly. It would be interesting to listen to a sample that has everything above a certain frequency pitch-shifted downwards.

Here's how I took it: if you're just losing sensitivity at a particular frequency, then you may only hear sounds in the 40-100dB range. Below that it's too quiet to register, and above it it's painful. That's a lot of information to lose, but you can smash the 1-100dB range into the 40-100dB range. If you choose to, you could even smash the 1-60dB range into the 40-60dB range (or pick whatever numbers) and leave everything above that relatively untouched. This is a fairly common sound engineering technique to fill out a sound without destroying its dynamics.
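
To make the arithmetic concrete, here's a rough sketch of that "smashing" as a plain linear map on levels in dB. This is only my illustration of the numbers above, not what the article's product does: the function names and the 60dB knee are made up, and a real compressor works on a signal with attack/release times rather than on single level values.

```python
def smash(level_db, in_lo=1.0, in_hi=100.0, out_lo=40.0, out_hi=100.0):
    """Linearly map a level from [in_lo, in_hi] dB onto [out_lo, out_hi] dB.

    Levels outside the input range are clamped to it first.
    """
    level_db = min(max(level_db, in_lo), in_hi)
    ratio = (out_hi - out_lo) / (in_hi - in_lo)  # output dB per input dB
    return out_lo + (level_db - in_lo) * ratio


def smash_bottom_only(level_db, knee_db=60.0, in_lo=1.0, out_lo=40.0):
    """Squeeze only [in_lo, knee_db] into [out_lo, knee_db].

    Anything at or above the (hypothetical) knee passes through untouched,
    which is what keeps the louder dynamics intact.
    """
    if level_db >= knee_db:
        return level_db
    return smash(level_db, in_lo, knee_db, out_lo, knee_db)


if __name__ == "__main__":
    for db in (1, 20, 50, 60, 80, 100):
        print(db, round(smash(db), 1), round(smash_bottom_only(db), 1))
```

The first function is the 1-100dB into 40-100dB case; the second is the 1-60dB into 40-60dB case, where the quiet end is lifted and the loud end is left alone.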

So if you picture a scale next to the bear picture from 1-100, then the bottom part of the bear is what's beneath the (effective) noise floor for that frequency. To extend the analogy to multiband compression you'd have maybe 10 bears next to each other, each missing different amounts and each needing a slightly different smashing to lift the bottom of the picture into the visible range.
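
The "10 bears" idea can be sketched the same way: one floor per frequency band, each band lifted by its own amount. The band centers and floors below are invented for illustration (three bands instead of ten to keep it short), and this skips everything a real multiband compressor needs, like splitting the signal into bands with filters and recombining them.

```python
# Hypothetical hearing floors per band: center frequency in Hz -> quietest
# audible level in dB. These numbers are made up for the example.
BAND_FLOORS_DB = {250: 20.0, 1000: 30.0, 4000: 55.0}


def lift_band(level_db, floor_db, in_lo=1.0, ceiling_db=100.0):
    """Map a level from [in_lo, ceiling_db] dB onto [floor_db, ceiling_db] dB."""
    level_db = min(max(level_db, in_lo), ceiling_db)
    ratio = (ceiling_db - floor_db) / (ceiling_db - in_lo)
    return floor_db + (level_db - in_lo) * ratio


def lift_all_bands(levels_by_band, floors=BAND_FLOORS_DB):
    """Apply each band's own mapping; every 'bear' gets a different smashing."""
    return {hz: lift_band(db, floors[hz]) for hz, db in levels_by_band.items()}


if __name__ == "__main__":
    quiet = {250: 10.0, 1000: 10.0, 4000: 10.0}
    for hz, db in lift_all_bands(quiet).items():
        print(f"{hz} Hz: 10.0 dB -> {db:.1f} dB")
```

The same quiet input lands at a different output level in each band, because the band with the worst loss (the highest floor) needs the most lift.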

edit: I think people are assuming that the frequency content of the bear picture corresponds to the frequency content of sound (they're all signals, right?), but to me it's a much more basic analogy. To do it that way you'd have to be turning up the soft reds or something to that effect. But rods and cones being what they are, we don't lose vision in a way comparable to how we lose hearing, so I don't think there's a good, intuitive visual analog in that sense.