by bmohlenhoff 4776 days ago
After using Shazam, I was kind of hoping there was more to it than just a time-windowed, frequency-domain peak-picking algorithm. The algorithm itself is pretty basic from a signal processing perspective, but I think the key insight here was that the results are unique enough to store and compare against other samples later.
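For anyone wondering what "time-windowed, frequency-domain peak picking" looks like in practice, here is a minimal sketch using only the Python standard library. It is not Shazam's actual code: the function names, the window and hop sizes, the naive DFT, and the choice to keep only one peak per frame are all assumptions made for illustration.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins of one frame (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def peak_pick(samples, window=64, hop=32):
    """Return (frame_index, strongest_bin) for each Hann-windowed frame."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (window - 1)) for i in range(window)]
    peaks = []
    for idx, start in enumerate(range(0, len(samples) - window + 1, hop)):
        frame = [samples[start + i] * hann[i] for i in range(window)]
        mags = dft_magnitudes(frame)
        # Keep only the single loudest bin; a real system would likely keep several.
        peaks.append((idx, max(range(len(mags)), key=mags.__getitem__)))
    return peaks

def tone(bin_index, length, window=64):
    """Hypothetical test signal: a sine centred on the given DFT bin."""
    return [math.sin(2 * math.pi * bin_index * t / window) for t in range(length)]

# Two tones played back to back; the picked bin should change partway through.
signal = tone(5, 128) + tone(12, 128)
print(peak_pick(signal))
```

The list of (frame, bin) pairs is the kind of compact result that can be stored and compared against other samples later.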
4 comments

Yeah, the magic (if there is any) is doing the match across a silly number of songs in a relatively short time. Not groundbreaking exactly, but operationally quite interesting.
I actually remember first using this by dialling 2580 about 10 years ago. At the time it felt truly magical.
Are there other uses for this algorithm/technique, when applied to signals other than audio? I mean apart from identifying a source from a small clip.
This type of analysis is commonly used in tons of things, like communications systems, image processing, radar, etc. I used a similar technique when trying to identify an underutilized wifi channel in the vicinity of my apartment.
IIRC there have been a number of papers on using a similar technique with speech to text applications.
Well, I'm sure they must be using a few tricks in their implementation. I've always been interested in knowing how Shazam actually works, and had in mind that they must somehow split a song into intervals and "hash" every interval, then store the hashes in some kind of indexed database for fast retrieval. Seems I was not too far off :)
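The "hash every interval and look it up" idea can be sketched in a few lines. This is a toy under stated assumptions, not the real implementation: each song is assumed to be already reduced to one dominant frequency bin per interval, the key is a hypothetical (peak, later peak, gap) tuple, the "database" is an in-memory dict, and the song data is made up.

```python
from collections import Counter, defaultdict

def fingerprints(peaks, fan_out=3):
    """Yield (key, offset) for pairs of nearby peaks.

    `peaks` is a list of dominant frequency bins, one per interval.
    """
    for i, anchor in enumerate(peaks):
        for gap in range(1, fan_out + 1):
            if i + gap < len(peaks):
                yield (anchor, peaks[i + gap], gap), i

def build_index(songs):
    """Map each key to every (song_id, offset) where it occurs."""
    index = defaultdict(list)
    for song_id, peaks in songs.items():
        for key, offset in fingerprints(peaks):
            index[key].append((song_id, offset))
    return index

def best_match(index, clip_peaks):
    """Vote for (song, time shift) pairs and return the winning song, or None."""
    votes = Counter()
    for key, clip_offset in fingerprints(clip_peaks):
        for song_id, song_offset in index.get(key, ()):
            votes[(song_id, song_offset - clip_offset)] += 1
    if not votes:
        return None
    (song_id, _shift), _count = votes.most_common(1)[0]
    return song_id

# Made-up peak sequences standing in for two songs.
songs = {
    "song_a": [5, 9, 5, 12, 7, 3, 9, 14, 2, 6],
    "song_b": [4, 4, 11, 8, 13, 1, 10, 6, 15, 3],
}
index = build_index(songs)
clip = songs["song_b"][3:8]  # a short excerpt from the middle of song_b
print(best_match(index, clip))
```

Voting on the time shift as well as the song means a clip only wins if its keys line up at a consistent offset, which is one plausible way to cut down on accidental matches.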
Yeah, this is the obvious implementation. As he said in his follow-up post:

>And second, I’d like to know which patents are in play. Because I just couldn’t think that something this easy (music-fingerprint is a hash, and we do a lookup) can be patented.. Maybe in the States, but in Europe?