The task I gave myself was to subtract out the drum beat (the song graciously gives the isolated loop before the instrument comes in), then mix/baseband the instrument to whatever frequency I wanted. If all went well, I would end up with a complex FIR filter that I could pass tones into.
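For anyone curious what the mixing step looks like, here is a minimal sketch, not the exact code I used. It assumes a mono signal in a NumPy array and moves every component by a fixed number of Hz by multiplying the analytic signal with a complex exponential. The names `analytic` and `shift_frequency` are made up for this example.

```python
import numpy as np

def analytic(x):
    """Analytic signal via the FFT: keep positive frequencies, drop negative ones."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # DC is kept once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # Nyquist bin is kept once
        weights[1:n // 2] = 2.0           # positive frequencies are doubled
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def shift_frequency(x, shift_hz, fs):
    """Shift all components of x by shift_hz (hypothetical helper)."""
    t = np.arange(len(x)) / fs
    return np.real(analytic(x) * np.exp(2j * np.pi * shift_hz * t))

# Usage: move a 440 Hz test tone up by 100 Hz.
fs = 8000
t = np.arange(fs) / fs                    # one second, so FFT bins are 1 Hz wide
tone = np.sin(2 * np.pi * 440 * t)
shifted = shift_frequency(tone, 100.0, fs)
peak_hz = int(np.argmax(np.abs(np.fft.rfft(shifted))))
print(peak_hz)
```

Note that this shifts every partial by the same number of Hz, which is not the same as transposing a note (that scales the partials instead), so it is only a starting point.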
This model assumes the timbre is independent of the tone, but I can see now that this assumption is quite wrong and something more complicated (like this ML modeling) would be needed.
That synth is extremely distorted after the voices are summed. (That whole album has so much distortion; it's lovely.)
So not only is timbre not independent of frequency, but the distortion applied after summing makes the combination of multiple notes non-linear too. The "beating" this causes is most obvious on the second chord to play. This beating is not consistent as the notes change; it depends on the difference in frequency between the two notes being played.
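A toy example of why this breaks a linear (FIR) model, under loud assumptions: the `distort` function below is a made-up asymmetric waveshaper, not the synth's actual distortion, and the two notes are plain sine waves. Distorting the sum creates a new component at the difference frequency that neither note contains on its own, and distorting the sum is not the same as summing the distorted notes.

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs                    # one second, so FFT bins are 1 Hz wide
f1, f2 = 440.0, 550.0                     # two arbitrary example notes

note_a = 0.5 * np.sin(2 * np.pi * f1 * t)
note_b = 0.5 * np.sin(2 * np.pi * f2 * t)

def distort(x):
    """Hypothetical asymmetric waveshaper standing in for the real distortion."""
    return x + 0.3 * x ** 2

def level_at(x, freq_hz):
    """Magnitude of the FFT bin closest to freq_hz."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    return spectrum[int(round(freq_hz * len(x) / fs))]

clean = note_a + note_b
distorted = distort(clean)

# The difference tone (f2 - f1 = 110 Hz) exists only after distortion.
print(level_at(clean, f2 - f1), level_at(distorted, f2 - f1))

# Superposition fails: distorting the sum differs from summing the distortions.
print(np.allclose(distorted, distort(note_a) + distort(note_b)))
```

Change `f2` and the extra component moves with `f2 - f1`, which matches the beating changing as the notes change.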