by kherud 1007 days ago
Thank you for sharing! On a tangent: I'm wondering if there are any good open source models/libraries to reconstruct audio quality. I'm thinking about an end-to-end open source alternative to something like Adobe Podcast [1] to make noisy recordings sound professional. Anecdotally it's supposed to be very good. In a recent search, I haven't found anything convincing. In my naive view this task seems much simpler than audio generation and the demand far bigger, since not everyone has a professional audio setup ready at all times.

[1] https://podcast.adobe.com/

5 comments

We've been researching an audio denoiser for music that we will present at the AES conference in October. Description page: https://tape.it/denoising

We'll also publish a webapp where you can use the denoiser for free. Mail me if you want beta access to it (email in profile).

It won't be open source, although the paper will of course be public. It will also only reduce noise, and not reconstruct other aspects of audio quality. However, it can do so on any audio (in particular music), not just speech like Adobe Podcast, and it fully preserves the audio quality. It's designed exactly for the use case you want: to make noisy recordings sound professional.

Are you sure the demo sound files are correct on the website? Couldn't appreciate any glaringly obvious differences between the original and denoised with studio grade headphones here. Or, the originals aren't noisy enough.
Denoising seems to fail in the guitar and vocals example.
Can you clarify where it fails? It's designed to remove stationary noise only, and removes it very well in the guitar and vocals example.

Generally speaking, if you have other sounds that you don't want in the audio, we don't remove them - it's hard to decide from a musical point of view whether you want a certain sound or not. To give an extreme example: a barking dog probably doesn't belong in a Zoom conference, but it may very well belong in your audio recording. Removing such elements would be a creative decision.

The guitar and vocals example has certain clicks in the background that we don't remove - but the stationary noise is gone. Existing professional (and complex) audio restoration tools like iZotope RX don't remove those clicks, either. It's a conservative approach, sure, but in return you can throw any audio at it and it always improves it.
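For readers unfamiliar with the term, "stationary noise" is noise whose spectrum stays roughly constant over time (hiss, hum), which is why it can be estimated once and then attenuated everywhere. Below is a minimal sketch of classic spectral subtraction using only NumPy. It is a textbook illustration, not the method discussed in this thread; the function names, the `noise_clip` argument (a noise-only excerpt), and the parameter defaults are all assumptions made for the example.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    # Hann-windowed short-time Fourier transform; one row per frame.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    # Weighted overlap-add resynthesis, normalised by the summed squared window.
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def reduce_stationary_noise(signal, noise_clip, n_fft=512, hop=128, floor=0.05):
    # Hypothetical helper: estimate the average noise magnitude per frequency
    # bin from a noise-only clip, then scale each time-frequency bin of the
    # signal by a gain that subtracts that estimate (never below `floor`).
    noise_mag = np.abs(stft(noise_clip, n_fft, hop)).mean(axis=0)
    # Zero-pad so every original sample is covered by fully overlapping frames.
    padded = np.pad(signal, n_fft)
    spec = stft(padded, n_fft, hop)
    mag = np.abs(spec)
    gain = np.maximum(1.0 - noise_mag / np.maximum(mag, 1e-12), floor)
    cleaned = istft(spec * gain, n_fft, hop)
    return cleaned[n_fft:n_fft + len(signal)]
```

Because the noise estimate is a single time-averaged spectrum, a sketch like this leaves short, non-stationary events such as clicks largely untouched, which matches the behaviour described above.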

It’s not open, but Nvidia offers RTX Voice for free if you have an Nvidia card.

The only weird thing is that it’s designed to be used in real time, but I’ve had some luck cleaning up voice recordings by playing them back through it via audio routing.

There seems to have been a fork in the road:

On one side the tech for literal denoising has stagnated a bit. It’s a very hard problem to remove all noise while keeping things like transients.

On the other side, AI is being rapidly developed for its ability to denoise by recreating the recording, just without the noise.

In our denoiser (see other comment), we worked on combining these two forks. That’s how we can mathematically guarantee great audio quality.

This combination was non-trivial as training old school DSP denoisers is not easily possible. We’ll describe the math needed in our paper. We hope our publication will help the wider community work not just on denoising but also tasks like automatic mixing.

I have had a lot of success with this: https://ultimatevocalremover.com/ for de-noising
https://youtu.be/o-kJ4_CuWzA

This video from MKBHD's studio channel dives into this topic