HN Mirror

by _kb 1205 days ago

Had a great encounter with this recently!

In an environment I work in, there are multichannel audio recordings that get archived. The archival recordings all had a perfect 4kHz tone appearing, seemingly out of nowhere. This was happening on every channel, across every room, but only in one building. Nowhere else. Absolutely nothing of the sort showed up on live monitoring. The systems were all the same, and yet this behaviour was consistent across all systems at only one location.

The full system was reviewed: processing, recording, signal distribution, audio capture, and the rooms themselves. Maybe there was a test gen that had accidentally been deployed? Nope. Some odd bug in an echo canceller? Also no. Something weird with interference from lighting or power? Slim chance, but also no. Complete mystery.

When looking for acoustic sources there was an odd little blip on the RTA at 20kHz. This was traced back to a test tone emitted by the fire safety system (an ultrasonic signal for continuous monitoring). It's inaudible to most people and will be filtered before any voice-to-text processing, so no reason for concern. Anyway, 20kHz is nowhere near 4kHz, so the search continued.

The dissimilarity of 20kHz and 4kHz holds until you consider what happens in a signal that isn't band-limited. The initial capture was taking place at a 48kHz sampling rate. It turns out the archival was downsampling to 24kHz without applying an anti-aliasing filter. Without filtering, any frequency content above the Nyquist frequency 'folds' back into the reproducible range. So in this case a clean 24kHz-bandwidth signal with a little bit of inaudible ultrasonic background noise was being folded around the new 12kHz Nyquist frequency: the 20kHz tone sits 8kHz above it, so it lands 8kHz below it, creating a very audible 4kHz tone.
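
For anyone who wants to see the fold happen, here is a minimal Python sketch. It is not the actual archival code (which I can't share); the helper names `alias_frequency` and `peak_frequency` and the 10 ms test signal are made up for illustration. It synthesises a 20kHz tone at 48kHz, drops every other sample with no filtering, and looks for the strongest frequency bin before and after.

```python
import cmath
import math

def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a pure tone at f_hz when sampled at fs_hz
    with no anti-aliasing filter (hypothetical helper for illustration)."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def peak_frequency(samples, fs_hz):
    """Frequency (Hz) of the strongest DFT bin between 0 and fs/2.
    Naive O(n^2) DFT, fine for a few hundred samples."""
    n = len(samples)
    def magnitude(k):
        return abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                       for i, x in enumerate(samples)))
    best_bin = max(range(n // 2 + 1), key=magnitude)
    return best_bin * fs_hz / n

FS_CAPTURE = 48_000   # capture rate from the story
FS_ARCHIVE = 24_000   # archive rate from the story
TONE_HZ = 20_000      # the fire safety system's test tone

# 10 ms of the ultrasonic tone as seen by the capture side.
captured = [math.sin(2 * math.pi * TONE_HZ * i / FS_CAPTURE) for i in range(480)]

# Naive 2x downsample: keep every other sample, no low-pass filter.
decimated = captured[::2]

print(alias_frequency(TONE_HZ, FS_ARCHIVE))    # 4000
print(peak_frequency(captured, FS_CAPTURE))    # 20000.0
print(peak_frequency(decimated, FS_ARCHIVE))   # 4000.0
```

The tone is untouched in the 48kHz data and reappears at 4kHz the moment the samples are thrown away, which matches what showed up only in the archive and never on live monitoring.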

It was essentially a capture the flag for signals nerds and a whole lot of fun to trace.

spacechild1 1205 days ago

> It turns out the archival was downsampling to 24kHz

But... why?

InitialLastName 1205 days ago

In situations where you don't need the archival to be at "perfect reproduction" quality (including things like broadcast archives or recordings of voice comms) you can get by with a 12kHz maximum frequency without losing the essentials (especially clarity of voices). Many adults can't hear much past 12kHz anyway, and most music and voice material doesn't have content past 10kHz. You don't lose much, but you halve your file size with 2x downsampling.

Sesse__ 1205 days ago

I'd guess the “why” was “why on earth did they not have an antialiasing filter”, not “why did they downsample”. A good lowpass filter is easy to design, cheap to apply, and protects you from this kind of stuff.
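
To give a rough idea of how little is involved, here is a sketch of a Hamming-windowed sinc low-pass applied before 2x decimation, in plain Python. The 10kHz cutoff, the 101 taps, and the function names are my own assumptions for illustration, not anything from the system described above; a real pipeline would more likely use a library resampler.

```python
import math

def lowpass_taps(cutoff_hz, fs_hz, num_taps=101):
    """Hamming-windowed sinc FIR low-pass, normalised to unity gain at DC.
    num_taps is assumed to be odd so the filter has a centre tap."""
    fc = cutoff_hz / fs_hz            # cutoff as a fraction of the sample rate
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        # Ideal low-pass impulse response (sinc), with the t == 0 limit handled.
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]

def filter_and_decimate(samples, taps, factor=2):
    """Convolve with taps, computing only every factor-th output sample.
    Starts once the filter is fully loaded, so there is no start-up transient."""
    out = []
    for m in range(len(taps) - 1, len(samples), factor):
        out.append(sum(h * samples[m - k] for k, h in enumerate(taps)))
    return out

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

FS = 48_000
tone_20k = [math.sin(2 * math.pi * 20_000 * i / FS) for i in range(2400)]
voice_1k = [math.sin(2 * math.pi * 1_000 * i / FS) for i in range(2400)]
taps = lowpass_taps(cutoff_hz=10_000, fs_hz=FS)

print(rms(tone_20k[::2]))                        # unfiltered: alias at full level
print(rms(filter_and_decimate(tone_20k, taps)))  # filtered: tone strongly attenuated
print(rms(filter_and_decimate(voice_1k, taps)))  # in-band content passes through
```

Only the output samples that survive decimation are computed, so the cost per archived sample is one dot product with the tap list.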

InitialLastName 1205 days ago

I was working off the quote, but I can see some reasons that someone would decide not to AA filter. Depending on the context it might be reasonable to assume that the signal is band-limited anyway (talk-oriented radio especially is often low-pass filtered) and it's easy to miss that some point in the system can introduce an (inaudible to most humans) artifact. Those assumptions, along with the desire to avoid complexity (every step in the signal path is an opportunity for failure) could easily tip you to "just downsample".

I'd also emphasize how little most of the people involved in these systems care about the quality of the archive. If it's good enough to a) confirm there was signal on the channel and b) understand the voices involved, it's good enough to not worry about further.

kimburgess 1205 days ago

> I'd also emphasize how little most of the people involved in these systems care about the quality of the archive. If it's good enough to a) confirm there was signal on the channel and b) understand the voices involved, it's good enough to not worry about further.

This is uncomfortably accurate. I work with the capture side of these systems, and people in that space care deeply about the integrity of the signal but have little concern for what it contains. Archival is the inverse: the information content of the signal is what's important, not the signal itself.
