| HN Mirror



	by genevoronkov 4824 days ago
	I mirrored this implementation a while ago since the full source isn't available. It was not nearly as successful as the blogger portrays. For example, if I used a high-quality mono WAV file to create a fingerprint, it would have a hard time identifying the same track as an MP3. It seems the maxima actually get shifted and merged by compression. In other words, there's a reason Shazam uses entropy-based anchor points to help it pick hashing values.


bmohlenhoff 4824 days ago

I'm wondering if they bound the fingerprint search to human-audible frequencies. MP3, as a lossy codec, works by discarding information in the input signal that listeners are unlikely to hear, such as inaudible frequencies. I believe this could be mirrored in the implementation by running the frequency-domain peak-picking algorithm only over specific bin ranges.
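
To make the bin-range idea concrete, here is a minimal sketch of peak picking restricted to chosen frequency bands. It is not taken from the blog post or the Shazam paper; the function names (`magnitude_spectrum`, `band_limited_peaks`), the band edges, and the test tones are all assumptions picked for illustration, and a naive DFT stands in for a real FFT to keep the example dependency-free.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for a small demo)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def band_limited_peaks(frame, sample_rate, bands_hz):
    """Return the frequency (Hz) of the strongest bin inside each (low, high) band.

    Bins outside every band are never considered, so energy there cannot
    become a fingerprint peak. Band edges are hypothetical, not from the paper.
    """
    n = len(frame)
    mags = magnitude_spectrum(frame)
    hz_per_bin = sample_rate / n
    peaks = []
    for low, high in bands_hz:
        lo_bin = max(0, math.ceil(low / hz_per_bin))
        hi_bin = min(len(mags) - 1, math.floor(high / hz_per_bin))
        if lo_bin > hi_bin:
            continue  # band is narrower than one bin at this resolution
        best = max(range(lo_bin, hi_bin + 1), key=lambda k: mags[k])
        peaks.append(best * hz_per_bin)
    return peaks

# Synthetic frame: tones at 500 Hz and 2000 Hz, plus a louder 3500 Hz tone
# that lies outside both search bands and is therefore ignored.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 500 * t / sample_rate)
    + 0.8 * math.sin(2 * math.pi * 2000 * t / sample_rate)
    + 2.0 * math.sin(2 * math.pi * 3500 * t / sample_rate)
    for t in range(256)
]
print(band_limited_peaks(frame, sample_rate, [(300, 1000), (1000, 3000)]))
# prints [500.0, 2000.0]
```

Even though the 3500 Hz tone is the loudest component, it never shows up as a peak because no band covers it. Whether this restriction is enough to survive MP3 encoding is a separate question.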


genevoronkov 4824 days ago

I don't recall if the paper specifies the frequency ranges used, but my implementation was bound to audible frequencies. I was going to use a hill-climbing search to find optimal frequency ranges but came to the conclusion that my implementation was too flawed regardless. If I looked at the two graphs side by side (compressed vs. uncompressed), they looked nothing alike. For example, a peak might be in the same region, but it would be shifted.
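
One way to put a number on "same region but shifted" is to compare per-frame peak bins from the two versions, first exactly and then with a small tolerance. This is only a sketch: `peak_match_rate` is a hypothetical helper, and the bin lists below are made-up illustrative values, not measurements from any real track.

```python
def peak_match_rate(reference_bins, query_bins, tolerance=0):
    """Fraction of frames whose peak bins agree within +/- tolerance bins."""
    if len(reference_bins) != len(query_bins):
        raise ValueError("both peak lists must cover the same number of frames")
    if not reference_bins:
        return 0.0
    hits = sum(
        1 for ref, qry in zip(reference_bins, query_bins)
        if abs(ref - qry) <= tolerance
    )
    return hits / len(reference_bins)

# Hypothetical peak bins for four frames (illustrative values only).
uncompressed = [40, 80, 120, 180]
compressed = [41, 80, 118, 181]

print(peak_match_rate(uncompressed, compressed))               # prints 0.25
print(peak_match_rate(uncompressed, compressed, tolerance=2))  # prints 1.0
```

With numbers like these, a hash built from exact bin indices would match in only one frame out of four, while a scheme that tolerates a shift of a couple of bins would match in all of them.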


willvarfar 4824 days ago

http://www.redcode.nl/blog/2012/03/devoxx-2011-talk-freely-a... has a demo in the video
