by hex4def6
704 days ago
Assuming OP is correct, your last sentence implies this isn't the solution being used. Additionally, many (citation needed) YouTube videos have people talking in them; this method wouldn't help with that. Isolating vocals in general is significantly more difficult than just filtering by frequency range. Any instrument I can think of can produce notes squarely within the typical range of the human voice (see: https://www.dbamfordmusic.com/frequency-range-of-instruments...)
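To illustrate the frequency-overlap point, here is a rough sketch using NumPy. Everything in it is an assumption made for the demo: the pure sine tones standing in for a voice and an instrument, the 80-1000 Hz "voice band", the 8 kHz sample rate, and the hypothetical helper `band_pass_fft`. It is not what any browser or app actually does; it only shows that a band-pass mask on the spectrum removes out-of-band noise but cannot separate two sources that share the band.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz (assumed for the demo)
DURATION = 1.0      # seconds, so every integer frequency lands on an exact FFT bin


def band_pass_fft(signal, low_hz, high_hz, sample_rate=SAMPLE_RATE):
    """Zero every FFT bin outside [low_hz, high_hz] and transform back."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))


def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(x ** 2)))


t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE
voice = np.sin(2 * np.pi * 220 * t)        # stand-in for a spoken/sung pitch
instrument = np.sin(2 * np.pi * 262 * t)   # stand-in for an instrument note in the same range
hiss = 0.5 * np.sin(2 * np.pi * 3000 * t)  # stand-in for out-of-band noise
mix = voice + instrument + hiss

# Keep only the assumed "voice band".
filtered = band_pass_fft(mix, low_hz=80, high_hz=1000)

# The hiss is gone, but the instrument survives untouched alongside the voice.
error_vs_voice_only = rms(filtered - voice)
error_vs_voice_plus_instrument = rms(filtered - (voice + instrument))

print(f"residual vs. voice alone:        {error_vs_voice_only:.4f}")
print(f"residual vs. voice + instrument: {error_vs_voice_plus_instrument:.2e}")
```

The filtered output matches voice plus instrument almost exactly, while the residual against the voice alone is the full level of the instrument tone. Real vocal isolation has to lean on something other than frequency range.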
The initial question may be specific, to a certain degree, to the way one particular browser handles things, but the comment was also trying to communicate that this can go beyond the browser and be handled by the application itself. The microphone can also be participating at some level if it features noise suppression or other enhancements.
The surprise about things being different when using a separate browser comes from assuming that any audio reaching the microphone should be processed equally when using Fourier transforms (FTs), or machine learning if applicable, so the audio source shouldn't matter.
References:
- https://www.nti-audio.com/en/support/know-how/fast-fourier-t...
- https://pseeth.github.io/public/papers/seetharaman_2dft_wasp...