Hacker News
by jacquesm 1963 days ago
The problem with FFTs is that the bins are spaced linearly in frequency while pitch is logarithmic, so the lowest octaves get very few bins while the high end gets ridiculous resolution, and there is no easy way to even this out. Merging bins at the high end saves some space but doesn't make the low end any more accurate.
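
To make the imbalance concrete, here is a rough sketch that counts how many FFT bin centres land in each octave. The sample rate (44100 Hz), FFT size (4096), and starting frequency (55 Hz) are just assumptions for illustration, and `bins_per_octave` is a made-up helper name, not something from a library.

```python
def bins_per_octave(sample_rate, fft_size, f_low, n_octaves):
    """Count FFT bin centres that fall in each octave starting at f_low."""
    bin_width = sample_rate / fft_size  # Hz between adjacent bin centres
    counts = []
    for k in range(n_octaves):
        lo = f_low * 2 ** k
        hi = 2 * lo
        # Bins whose centre frequency lies in [lo, hi)
        n = sum(1 for i in range(fft_size // 2 + 1) if lo <= i * bin_width < hi)
        counts.append((lo, hi, n))
    return counts


# Illustrative parameters only (assumed, not from the original comment)
for lo, hi, n in bins_per_octave(44100, 4096, 55.0, 6):
    print(f"{lo:7.1f}-{hi:7.1f} Hz: {n} bins")
```

Every octave holds the same twelve semitones, but the bin count roughly doubles with each octave up, so the lowest octave has only a handful of bins to resolve them.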

So you need to run multiple methods in parallel and either decide, based on the very rough distribution of the energy in the spectrum, which method has the biggest chance of success, or use the outputs of all the methods to drive some logic that assigns a weight to each.
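
A minimal sketch of the weighting idea, under a few assumptions of mine: there are two estimators (say a time-domain one such as autocorrelation and an FFT peak picker), the time-domain one is trusted more when most of the energy sits low in the spectrum, and the 500 Hz crossover is arbitrary. The function names are hypothetical and the estimators themselves are not shown.

```python
import math


def low_band_fraction(magnitudes, bin_width_hz, crossover_hz=500.0):
    """Fraction of spectral energy in bins below crossover_hz (assumed threshold)."""
    energies = [m * m for m in magnitudes]
    total = sum(energies)
    if total == 0.0:
        return 0.5  # silence: no basis for preferring either method
    low = sum(e for i, e in enumerate(energies) if i * bin_width_hz < crossover_hz)
    return low / total


def combine_estimates(f_time_domain, f_fft, low_fraction):
    """Blend two pitch estimates (in Hz) using the low-band fraction as weight.

    The blend is done on log2(frequency) so the weight acts on pitch
    intervals rather than on raw Hz.
    """
    w = low_fraction  # weight for the time-domain estimate (assumption)
    log_f = w * math.log2(f_time_domain) + (1.0 - w) * math.log2(f_fft)
    return 2.0 ** log_f
```

For example, `combine_estimates(110.0, 112.0, low_band_fraction(mags, 44100 / 4096))` would lean toward the first estimate when `mags` is dominated by low bins. A hard switch between methods is the same thing with the weight rounded to 0 or 1.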

It's a tricky problem, to put it mildly. Also, this is the simplest form of the problem; doing this accurately for multiple pitches at once is much harder.

Another source of inspiration is the 'Onsets and Frames' model that powers some automated transcription software:

https://github.com/magenta/magenta/tree/master/magenta/model...

If this code is over your head, a good introductory course on signal processing would be a nice thing to have under your belt.

Best of luck!