by duped 704 days ago
> Basically every application that uses microphone input will want to do this

The OS doesn't have more information about this than applications do, and it's not obvious whether an application wants the OS to fuck around with the audio input it sees. Even in the applications where this might be the obvious default behavior, you're wrong, since most listeners don't use loudspeakers at all, and echo is not a problem when they wear headphones. Detecting that (and whether the input is a microphone at all) is not straightforward.

Not all audio applications are phone calls.


> The OS doesn't have more information about this than applications

the OP pointed out that this only works if he uses a browser monoculture

the OS does have more information than that, it can know what is being played by any/all apps, and what is being picked up by the mic

The "OS" isn't special here, apps can listen to system audio.

fwiw, you only need to know anything about outputs if you are doing AEC. Blind source separation doesn't have that problem and can just process the input stream.

> The "OS" isn't special here, apps can listen to system audio.

Even if this is true, it's easy to imagine such functionality being exploited by malicious apps as a security and/or privacy concern, particularly if the user needs a screen reader.

It definitely makes sense for the operating system to provide this functionality.

The OS can have multiple sound input devices for the application to choose from, "raw" and "fuckarounded with".

That doesn't make sense in the context of default devices. macOS's AVKit (or is it CoreAudio?) APIs, which configure the streams created on the device, make way more sense, since it's a property of the audio i/o stream and not of the device.

Assuming this isn't parody, the OS doesn't have to do it automatically. Having an application grab a microphone stream and say to the OS "take this and cancel any audio out streams" might be pretty useful.

I agree with that, but the point I'm trying to make is that audio i/o handling is pretty complicated and application specific. The claim I'm challenging is that "any app that wants microphone input wants this." I'd say only a small number of the audio applications that care about mic input want background noise reduced, and it makes sense for this to be configured per input stream.

Really what would be nice is if every audio i/o backend supported multiplex i/o streams and you could configure whether or not to cancel audio based on that set of streams but not all output (because multi output-device audio gets tricky).
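
To make the per-stream idea concrete, here is a rough sketch of what such a configuration could look like. Everything in it is hypothetical (`OutputStream`, `InputStream`, `cancel_refs`, `references_for`, and the device names are made up for illustration); it is not the API of any real audio backend, just one way to model "cancel these output streams from this input, but not all output".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OutputStream:
    # Hypothetical handle for one playback stream on one device.
    name: str
    device: str


@dataclass
class InputStream:
    # Hypothetical handle for one capture stream.
    name: str
    device: str
    # Output streams to cancel from this input; empty means "raw signal".
    cancel_refs: list = field(default_factory=list)


def references_for(inp, active_outputs):
    """Return the active outputs that `inp` asked to have cancelled."""
    return [out for out in active_outputs if out in inp.cancel_refs]


# Two apps share the same microphone but want different treatment.
call_out = OutputStream("call", "speakers")
music_out = OutputStream("music", "usb-dac")
call_mic = InputStream("call-mic", "builtin-mic", cancel_refs=[call_out])
daw_mic = InputStream("daw-mic", "builtin-mic")  # wants the raw input
```

With this shape, the call app gets only its own far-end audio cancelled, the music on the other device is left alone, and the DAW sees an untouched stream from the same microphone.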

I'm honestly having trouble thinking of a case where I wouldn't want this.

I'm sure there are some niche cases, but in those cases, the application can specifically request that the OS turn off audio isolation.

The technique introduces latency and distortion because it's subtracting an estimate of sound that's traveling/reflecting in the listening environment, which is imperfect and involves the speed of sound.
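
For anyone wondering what "subtracting an estimate" means in practice, here is a toy sketch using a normalized LMS (NLMS) adaptive filter, which is one textbook way to do AEC. I'm not claiming this is what any particular OS ships. The echo path, tap count, and step size below are made-up illustration values, and the simulated mic signal contains only the echo (no talker), so it shows the cancellation itself and not the distortion you get once a real voice overlaps the echo.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `ref` from `mic`."""
    w = [0.0] * taps      # running estimate of the speaker-to-mic echo path
    hist = [0.0] * taps   # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        hist = [x] + hist[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, hist))
        e = d - echo_est  # residual after cancellation
        norm = sum(xi * xi for xi in hist) + eps
        step = mu * e / norm
        w = [wi + step * xi for wi, xi in zip(w, hist)]
        out.append(e)
    return out, w


def power(xs):
    return sum(x * x for x in xs) / len(xs)


def simulate(n=4000, seed=0):
    """Play white noise through a made-up echo path and cancel it."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    path = [0.0, 0.0, 0.6, 0.3, -0.1]  # assumed: short delay, decaying reflections
    echo = [
        sum(path[k] * ref[i - k] for k in range(len(path)) if i - k >= 0)
        for i in range(n)
    ]
    out, w = nlms_echo_cancel(echo, ref)
    return echo, out, w
```

Calling `simulate()` and comparing `power(out[-500:])` with `power(echo[-500:])` shows the residual shrinking as the filter converges. Note that the filter needs the reference signal `ref`, i.e. it has to know what is being played, and its tap count has to cover the delay of the acoustic path.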

That latency is within the tolerance that users are comfortable with for voice chat, and much less than what video processing and transfer already add to video calls, so it's a very obvious win there. Especially since those users are most interested in just picking out clear words using whatever random mic/speaker configuration happens to be most convenient.

But musicians, for instance, are much more interested in minimizing the delay between their voice or instrument being captured and returned through a monitor, and they generally choose a hardware arrangement that avoids the problem in the first place. And that's not really a niche use case.

Live video or audio chat is basically the only time you do want this. Granted, that’s a big chunk of microphone usage in practice, but any time you are doing higher fidelity audio recording and you have set up the inputs accordingly you absolutely do not want the artefacts introduced by this cancellation. DAWs, audio calibration, and even live audio when you’ve ensured the output cannot impact the recording all would want it switched off.

Default on vs default off is really just an implementation detail of the API though, as you say.

> Live video or audio chat is basically the only time you do want this.

If I'm recording a voice memo, or talking to an AI assistant, I would want this. Basically everything I can imagine doing with a PC microphone outside of (!) professional audio recording work.

That last case is important and we agree there needs to be a way to turn it off. I think defaults are really important though.

My colleague works in a very quiet house and has no need for noise cancelling. Sometimes he has it turned on by accident, and the quality is much worse: his voice gets cut out, and the volume of his voice goes up and down.

As you say, as long as either option is available, the only question is what the default should be.

I gave an example: when I'm wearing headphones I don't want this enabled. If I'm recording anything, I probably don't want it on either. If I'm using a virtual output, I don't want AEC to treat that as a loudspeaker.
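
Those three cases can be written down as a small decision rule. This is only a sketch: `AudioRoute`, its field names, and the `"loudspeaker"` / `"headphones"` / `"virtual"` labels are hypothetical, and it assumes the output kind is already known, which is the hard part in practice.

```python
from dataclasses import dataclass


@dataclass
class AudioRoute:
    # Hypothetical labels: "loudspeaker", "headphones", or "virtual".
    output_kind: str
    # True when the app is capturing for fidelity rather than conversation.
    recording: bool


def wants_echo_cancellation(route):
    """Only cancel when a real loudspeaker can leak into a conversation mic."""
    if route.output_kind != "loudspeaker":
        return False  # headphones and virtual outputs produce no acoustic echo
    return not route.recording
```

So a call over speakers gets AEC, while headphones, a virtual output, or a recording session leave the input alone.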
Every normal application already does it through the OS, because most do not care about this at all.

Music player, browser, games, video player...

Audio is not app specific.

The only application where this is true is audio work where you want full control and low latency.

I find your take very weird.