The actual identification of individual songs is done using the Echonest API (they have a huge database of song hashes).
Before that can happen, though, the position of each song within the full audio track has to be determined, which is done with a combination of ffmpeg and C. Currently this process is quite slow and inaccurate - I'm trying to learn some ML theory at the moment in the hope of improving it!
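To give a rough idea of what "finding where each song sits in the track" can look like, here is a minimal sketch of one common heuristic: look for stretches of low energy between songs. This is not the ffmpeg + C code described above, just an illustration. It assumes the audio has already been decoded to mono float samples in the range -1..1 (for example by ffmpeg), and the names `rms_per_window` and `find_boundaries`, as well as the threshold and window defaults, are made up for the example.

```python
import math


def rms_per_window(samples, window):
    """Return the RMS energy of each consecutive, non-overlapping window."""
    energies = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        energies.append(math.sqrt(sum(x * x for x in chunk) / window))
    return energies


def find_boundaries(samples, sample_rate, window_s=0.5,
                    threshold=0.01, min_quiet_s=1.0):
    """Return candidate song boundaries, in seconds.

    A boundary is reported at the midpoint of every quiet run that lasts
    at least min_quiet_s and has audio on both sides. Quiet runs at the
    very start or end of the track are ignored.
    """
    window = int(sample_rate * window_s)
    energies = rms_per_window(samples, window)
    min_windows = max(1, int(round(min_quiet_s / window_s)))

    boundaries = []
    run_start = None  # index of the first window in the current quiet run
    for i, energy in enumerate(energies):
        if energy < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The quiet run just ended at window i (exclusive).
            if run_start > 0 and i - run_start >= min_windows:
                boundaries.append((run_start + i) / 2 * window_s)
            run_start = None
    return boundaries
```

A heuristic like this only works when there are real gaps between songs; it will miss crossfaded transitions in a DJ mix entirely, which is one reason a learned approach might do better. ffmpeg also ships a `silencedetect` audio filter that reports quiet regions directly, which may be worth comparing against.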
Great work and a really interesting read! Really nice performance too - for reference, Echonest advises that samples be 30s or longer for optimal recognition.