Is more than one instrument or pitch sounding at the same time? If so, recovering the individual notes is almost impossible. You could try deep learning, but there's no guarantee that your results would transfer to other recordings.
There are a couple of source separation algorithms that do a pretty good job of splitting the audio into single-instrument tracks (I tried open_unmix). What I'm struggling with is turning the audio from one of those single-instrument tracks into frequencies or notes.
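
For a track where only one pitch sounds at a time, one common starting point is autocorrelation pitch detection on short frames. The sketch below is only an illustration of that idea, not something open_unmix provides: the function names `estimate_f0` and `hz_to_note`, the 80 to 1000 Hz search range, and the 2048-sample frame are all assumptions picked for the example. It uses only the standard library, so it is slow, and it will likely make octave errors on real recordings; a library pitch tracker such as librosa's `pyin` is probably more robust in practice.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def estimate_f0(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one monophonic frame.

    Picks the strongest autocorrelation peak after the lobe around lag 0.
    Returns None when no usable peak exists (for example, silence).
    fmin and fmax are assumed search bounds, not values from the thread.
    """
    n = len(frame)
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(n - 1, int(sample_rate / fmin))

    # Unnormalised autocorrelation for lags 0..max_lag.
    acf = [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

    # Skip the main lobe around lag 0: start at the first negative value.
    start = next((lag for lag in range(1, max_lag + 1) if acf[lag] < 0), None)
    if start is None:
        return None
    start = max(start, min_lag)
    if start > max_lag:
        return None

    best_lag = max(range(start, max_lag + 1), key=lambda lag: acf[lag])
    if acf[best_lag] <= 0:
        return None
    return sample_rate / best_lag


def hz_to_note(freq, a4=440.0):
    """Map a frequency in Hz to the nearest equal-tempered note name."""
    midi = round(69 + 12 * math.log2(freq / a4))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    # Synthetic test tone standing in for one frame of a separated track.
    sr = 22050
    tone = [math.sin(2 * math.pi * 440.0 * i / sr) for i in range(2048)]
    f0 = estimate_f0(tone, sr)
    print(f0, hz_to_note(f0))
```

To get a note sequence from a whole track you would run this over successive frames and merge consecutive frames that land on the same note. The integer lag limits the frequency resolution, which matters more for high notes.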