by davitb
2777 days ago
Disclosure: I'm the author of the blog post and a co-founder at 2Hz. This is a guest post on the NVIDIA Developer Blog. The technology comes from a startup called 2Hz (2hz.ai). Our passion is improving voice quality in audio/video calls. It's a tough problem but also fun to work on. I agree that breathing, reverb, and noise are all problems that should be fixed. We started with noise and have already shipped a product you can try on your Mac. The app is called Krisp (krisp.ai). Reverb, breathing, and voice cutting will come next.
|
Something struck me about the sample video. The very first sample included background noise, but it was very easy to understand anyway, probably because it was recorded with a pro microphone rather than a phone. Every other sample was far more difficult to understand, with or without noise removal. Noise removal doesn't really seem to help; in fact, any imperfections in the noise removal process make the audio harder to understand, because I have to mentally account for not only the speaker's voice and the noise but also whatever the noise removal algorithm did.
What does help me is low-frequency pickup. I think the first sample is easy because it has plenty of low-frequency components that are later lost when the audio goes through the phone.
Low frequencies are presumably difficult to pick up because of the size of the microphone in a phone, but could there be a way to restore those frequencies through audio processing? It would be interesting to analyze the response of specific microphones to specific low frequencies and find patterns that an audio processor could use to restore the low-frequency components.
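To make the idea concrete, here is a minimal sketch of one way such a restoration could work, assuming the microphone's response has already been measured. Everything specific in it is an assumption for illustration: the mic is modeled as a first-order high-pass filter with a 300 Hz cutoff, and the names `highpass_response` and `restore_low_end` are made up. It applies a capped inverse of that response in the frequency domain. This kind of equalization can only recover low frequencies that were attenuated but are still above the noise floor; it cannot recreate content that was lost entirely.

```python
import numpy as np

def highpass_response(freqs_hz, cutoff_hz):
    """Complex response of a first-order high-pass (hypothetical mic model)."""
    ratio = 1j * freqs_hz / cutoff_hz
    return ratio / (1.0 + ratio)

def restore_low_end(signal, sample_rate, cutoff_hz=300.0, max_gain_db=20.0):
    """Undo the assumed roll-off, capping the boost so noise is not amplified without limit."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    response = highpass_response(freqs, cutoff_hz)

    # Invert the response wherever it is non-zero (DC stays at zero gain).
    inverse = np.zeros_like(response)
    nonzero = np.abs(response) > 0
    inverse[nonzero] = 1.0 / response[nonzero]

    # Scale down any bin whose boost would exceed the cap.
    max_gain = 10.0 ** (max_gain_db / 20.0)
    gain = np.abs(inverse)
    inverse *= np.minimum(1.0, max_gain / np.maximum(gain, 1e-12))

    return np.fft.irfft(spectrum * inverse, n=len(signal))

# Synthetic check: a 100 Hz tone plus a 1 kHz tone, one second at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
original = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# Simulate the assumed mic by applying the high-pass response.
freqs = np.fft.rfftfreq(len(original), d=1.0 / sample_rate)
mic_recorded = np.fft.irfft(
    np.fft.rfft(original) * highpass_response(freqs, 300.0), n=len(original)
)

restored = restore_low_end(mic_recorded, sample_rate)
```

On this synthetic signal the 100 Hz tone is attenuated by the simulated mic and then brought back by the inverse filter. A real recording would also contain low-frequency noise, which is why the boost is capped rather than applied in full.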
Anyway, kudos for doing some very interesting work. I don't know how representative my experience is.