by bottom999mottob 589 days ago
This is really cool, but there's real-world instrument physics that might not be captured by simple Fourier transform templates. For example, a trumpet played softly can have a significantly different harmonic spectrum than the same trumpet played loudly, even at the same pitch.

Trumpets produce a rich harmonic series with strong overtones, meaning their Fourier transform would show prominent peaks at integer multiples of the fundamental frequency. Instruments like flutes have purer tones, while brass instruments typically have stronger higher harmonics, which would lead to more complex partial derivatives in the matrix equation shown in the article.
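To make the "peaks at integer multiples" point concrete, here is a rough sketch using only the Python standard library. The helper names and the harmonic amplitudes are made up for illustration (not measured from real instruments, and not taken from the article): it synthesizes a tone from a few harmonics and reads back the magnitude of individual DFT bins.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed for this sketch
N_SAMPLES = 800      # 0.1 s, so every multiple of 10 Hz falls on an exact DFT bin

def synth_tone(f0, harmonic_amps):
    """Sum of sinusoids at f0, 2*f0, 3*f0, ... with the given amplitudes."""
    return [
        sum(amp * math.sin(2 * math.pi * f0 * (k + 1) * n / SAMPLE_RATE)
            for k, amp in enumerate(harmonic_amps))
        for n in range(N_SAMPLES)
    ]

def dft_magnitude(signal, freq):
    """Magnitude of one DFT bin, scaled so a unit sinusoid reads as 1.0."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / SAMPLE_RATE)
              for n, x in enumerate(signal))
    return 2 * abs(acc) / len(signal)

# Illustrative amplitude profiles only: weak vs. strong upper harmonics.
flute_like = synth_tone(200, [1.0, 0.2, 0.05, 0.02, 0.01])
brass_like = synth_tone(200, [1.0, 0.8, 0.6, 0.5, 0.4])

for harmonic in range(1, 6):
    freq = 200 * harmonic
    print(freq, round(dft_magnitude(flute_like, freq), 3),
          round(dft_magnitude(brass_like, freq), 3))
```

Both tones peak at 200, 400, 600 Hz and so on, with essentially nothing in between; what differs is how quickly the peaks fall off, which is the part a fixed template has to get right.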

So this script uses bandpass filtering and cross-correlation of attack/release envelopes to identify note timing. Given that brass instruments can exhibit non-linear behavior where the harmonic content changes significantly with playing intensity (think of the brightness difference between pp and ff passages), I'm not sure how this algorithm would handle intensity-dependent timbral variations. I'd consider adding intensity-dependent Fourier templates for each instrument to improve accuracy.
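A minimal sketch of what intensity-dependent templates could look like, assuming one template of relative harmonic amplitudes per dynamic level. This is not how the article's script works. The template values, the `TRUMPET_TEMPLATES` name and the helper functions are all hypothetical, and a real version would need templates measured from recordings.

```python
import math

# Hypothetical templates: relative harmonic amplitudes per dynamic level.
# The numbers are invented for illustration, not measured from a trumpet.
TRUMPET_TEMPLATES = {
    "pp": [1.0, 0.3, 0.1, 0.03],
    "mf": [1.0, 0.6, 0.4, 0.2],
    "ff": [1.0, 0.9, 0.8, 0.6],
}

def cosine_similarity(a, b):
    """Compare spectral shape only, so overall loudness cancels out."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_template(observed, templates):
    """Return (level, score) for the template closest in shape to `observed`."""
    scored = ((level, cosine_similarity(observed, tmpl))
              for level, tmpl in templates.items())
    return max(scored, key=lambda pair: pair[1])

# Observed harmonic magnitudes for a bright, loud note (made-up values).
level, score = best_template([2.0, 1.7, 1.5, 1.1], TRUMPET_TEMPLATES)
print(level, round(score, 3))
```

Matching on shape rather than raw magnitude means the same bank of templates still works when the recording level changes. The dynamic level is inferred from how bright the spectrum is.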


As someone who uses source separation twice a week for mixing purposes, I can say the number of other instruments that can produce sounds of "vocal" quality is high. These models all stop functioning well when you have bands where the instruments don't sound typical and aren't played and/or mixed in a way that achieves maximum separation between them, e.g. an electric guitar with a distorted harmonic hitting the same note as your singer while the drummer plays only shrieking noises on their cymbals and the bass player simulates a punching kick drum on their instrument.

In these situations (experimental music) source separation will produce completely unpredictable results that may or may not be useful for musical rebalancing.

What tool do you use for the source separation? Everything I've used so far is great for learning or transcribing to MIDI but the separated tracks always have a strange phasing sound to them. Are you doing something to clean that up before mixing back in or are the results already good enough?
iZotope RX with Music Rebalance, great for reducing drum spill from vocal mics.