by kyzyl 2526 days ago
The tools to do this exist. It's usually called 'blind source separation', as in "What are the N distinct audio signals which sum up to best explain a given compound signal, without knowing the possible source signals ahead of time." Usually it's done with some sort of matrix factorization, Principal Component Analysis, and/or Independent Component Analysis. It's also used for non-audio signals, like pulling the discrete firings out of noisy EEG signals. It's definitely not a foolproof solution but in a lot of applications it can get you going, at least.
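To make the ICA idea concrete, here is a minimal sketch of a symmetric FastICA in plain NumPy. It is an illustration under assumptions, not a production separator: it assumes you have as many recorded channels as sources, that the mixing is instantaneous and linear, and that the sources are non-Gaussian. The function names (`fast_ica`, `_sym_decorrelate`) and the tanh nonlinearity are choices made for this example.

```python
import numpy as np

def _sym_decorrelate(W):
    # Make the rows of W orthonormal: W <- (W W^T)^(-1/2) W
    s, U = np.linalg.eigh(W @ W.T)
    return (U / np.sqrt(s)) @ U.T @ W

def fast_ica(X, n_iter=200, seed=0):
    """Unmix X of shape (n_channels, n_samples) into independent components.

    The recovered components come back in arbitrary order, sign and scale.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=1, keepdims=True)
    n, m = X.shape

    # Whiten: rotate and rescale so the channels are uncorrelated with unit variance
    d, E = np.linalg.eigh(X @ X.T / m)
    Z = (E / np.sqrt(d)) @ E.T @ X

    rng = np.random.default_rng(seed)
    W = _sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        Y = W @ Z
        g = np.tanh(Y)            # contrast nonlinearity
        g_prime = 1.0 - g ** 2    # its derivative
        W_new = g @ Z.T / m - np.diag(g_prime.mean(axis=1)) @ W
        W = _sym_decorrelate(W_new)
    return W @ Z

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(8000) / 8000.0
    sources = np.vstack([np.sin(2 * np.pi * 440 * t),   # a pure tone
                         rng.laplace(size=t.size)])     # heavy-tailed noise
    mixing = np.array([[1.0, 0.5], [0.4, 1.0]])          # made-up mixing matrix
    recovered = fast_ica(mixing @ sources)
    print(np.abs(np.corrcoef(np.vstack([recovered, sources]))[:2, 2:]).round(2))
```

With a single microphone you don't have multiple channels, so in practice people factorize a spectrogram instead (e.g. non-negative matrix factorization); the sketch above only covers the multi-channel ICA case.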

Given the problem setup, it isn't blind source separation. It is sound = song + other: a mixture model with 2 components.

Edit: If you know the song, it should be something simple: cross-correlate the audio with the known song, find the peak (that gives you the time shift), solve for the gain, and subtract the scaled and shifted song from the original track. It will be rubbish if the gain and timing have errors. You might need to do it in little chunks and interpolate the gains and shifts.
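A rough sketch of that recipe in NumPy, assuming a single global shift and a single scalar gain (no chunking, no interpolation), and that the recording and the reference song share a sample rate. The function name `remove_known_song` is made up for this example.

```python
import numpy as np

def remove_known_song(mix, song):
    """Subtract a scaled, shifted copy of `song` from `mix`.

    Returns (residual, lag_in_samples, gain).
    """
    mix = np.asarray(mix, dtype=float)
    song = np.asarray(song, dtype=float)

    # Full cross-correlation; output index i corresponds to lag i - (len(song) - 1)
    corr = np.correlate(mix, song, mode="full")
    # Use the absolute value so an inverted-polarity copy is still found
    lag = int(np.argmax(np.abs(corr))) - (len(song) - 1)

    # Place the song at the estimated offset inside a buffer as long as the mix
    shifted = np.zeros_like(mix)
    start_mix, start_song = max(lag, 0), max(-lag, 0)
    n = min(len(mix) - start_mix, len(song) - start_song)
    shifted[start_mix:start_mix + n] = song[start_song:start_song + n]

    # Least-squares gain: minimizes ||mix - gain * shifted||^2
    gain = np.dot(mix, shifted) / np.dot(shifted, shifted)
    return mix - gain * shifted, lag, gain

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    song = rng.standard_normal(2000)           # stand-in for the known song
    other = 0.1 * rng.standard_normal(5000)    # stand-in for everything else
    mix = other.copy()
    mix[700:2700] += 0.5 * song
    residual, lag, gain = remove_known_song(mix, song)
    print(lag, round(gain, 3))
```

`np.correlate` computes the correlation directly, which gets slow on full-length tracks; an FFT-based correlation is the usual substitute. The chunked version would run the same function per window and interpolate the resulting lags and gains.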

Edit 2. More generally, you might want to worry about the song having passed through some unknown transfer function (i.e. it is being played and recorded through shitty equipment). Then you have an interesting inverse problem. If everything is linear it will involve a regularized deconvolution. Will be tricky then.
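For the linear case, one way to sketch the regularized deconvolution is a Tikhonov-style division in the frequency domain: estimate a short filter `h` such that the recording is roughly `song` convolved with `h`, then subtract the filtered song. Everything here is an assumption for illustration: the function names, the number of taps, the regularization strength `lam`, and the premise that the song is already time-aligned with the recording.

```python
import numpy as np

def estimate_filter(mix, song, n_taps, lam=1e-3):
    """Estimate an FIR filter h (length n_taps) with mix ~= song * h + other."""
    mix = np.asarray(mix, dtype=float)
    song = np.asarray(song, dtype=float)

    # Zero-pad so the FFT product behaves like linear, not circular, convolution
    n = len(mix) + len(song)
    S = np.fft.rfft(song, n)
    X = np.fft.rfft(mix, n)

    # Regularized division: conj(S) X / (|S|^2 + eps) instead of X / S,
    # so frequencies where the song has little energy don't blow up
    power = np.abs(S) ** 2
    eps = lam * power.mean()
    H = np.conj(S) * X / (power + eps)

    # Keep only the first n_taps coefficients of the impulse response
    return np.fft.irfft(H, n)[:n_taps]

def subtract_filtered_song(mix, song, h):
    """Remove the song as heard through filter h from the recording."""
    filtered = np.convolve(song, h)
    out = np.array(mix, dtype=float)
    n = min(len(out), len(filtered))
    out[:n] -= filtered[:n]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    song = rng.standard_normal(4000)
    h_true = np.array([0.6, 0.3, -0.1, 0.05])   # made-up "bad equipment" response
    mix = np.convolve(song, h_true) + 0.05 * rng.standard_normal(4003)
    h = estimate_filter(mix, song, n_taps=4)
    print(h.round(3))
```

Larger `lam` trades bias for stability. Real music has far less energy at some frequencies than white noise does, so the regularization matters much more there than in this toy setup.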

It is still reducible to the more general blind source problem, right? We can conveniently "forget" that we know what the sources are, so now we are blindfolded and can still use the same techniques to solve it.
It will do worse with fewer assumptions. The more you know, the better you can estimate.
Sorry I didn't see your responses until now. Indeed, there are many ways to slice the specific problem. I was specifically responding to the parent's statement:

> It would be a game changer if someone were to come up with a novel method of decomposing audio into discrete components

It's something that has been generally addressed and ~works. It will obviously depend on the specifics of the application, and yes, if you can constrain the problem space further, you ought to do better!