by dest 3130 days ago
You mean 2d arrays containing the raw audio signal? No, this would not work because you do not know the phase along the y dimension when you want to compare to another signal.

Another way to detect an audio pattern is cross-correlation on the raw audio signal, but it is very expensive in both computation and memory.
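To make the cost concrete, here is a minimal sketch of brute-force cross-correlation in plain Python. The function names (`cross_correlate`, `best_lag`) and the toy signal are made up for illustration and do not come from any library mentioned in this thread. The scores are not normalized, which is only acceptable here because the toy signal is silent outside the pattern.

```python
import math

def cross_correlate(signal, pattern):
    """Sliding dot product of pattern against signal, for fully overlapping lags."""
    n, m = len(signal), len(pattern)
    return [
        sum(signal[lag + i] * pattern[i] for i in range(m))
        for lag in range(n - m + 1)
    ]

def best_lag(signal, pattern):
    """Lag at which the (unnormalized) correlation score is highest."""
    scores = cross_correlate(signal, pattern)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data (assumption): a 64-sample sine burst placed at offset 300 in silence.
pattern = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
signal = [0.0] * 1000
for i, value in enumerate(pattern):
    signal[300 + i] = value

print(best_lag(signal, pattern))
```

Every lag costs one multiply-add per pattern sample, so the work grows as roughly (signal length) x (pattern length). An FFT-based correlation reduces the arithmetic, but it still has to hold and process the raw samples of every track you compare against.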

The slowest step in fingerprinting is often the associated DB query, and there is a lot of work left to do there. In that space, Will Drevo's work is really good. I will share my DB implementation later.


I meant the spectrogram encoded as a 2d array, but I guess there isn't a big difference when the db query is the most expensive part.

I've always wondered: Is there a way to compare fingerprints with humming sounds or live recordings?

Those fingerprinting techniques don't seem to be suitable for those tasks. Do you know of any methods that are?

There are specialized fingerprinting algorithms that tolerate sound modifications such as pitch changes (https://biblio.ugent.be/publication/5754913), but that's not going to work with humming or live audio. I don't know if such a thing exists.

If you want to do some research, here is a short review paper on the topic http://www.cs.toronto.edu/~dross/ChandrasekharSharifiRoss_IS...

As for the 2d array spectrogram, it is not needed in my lib (except when plotting is activated). I only care about the maxima in the spectrum of each data window. In other words, 1d spectra are enough.
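A rough sketch of that idea, not the actual code from my lib: compute one 1d magnitude spectrum per window, keep only the strongest bins, and throw the spectrum away. The names (`magnitude_spectrum`, `window_peaks`), the window and hop sizes, and the "top N bins by magnitude" rule are all assumptions picked for illustration. A real implementation would use an FFT and probably a window function such as Hann, where this uses a naive DFT on raw frames to stay dependency-free.

```python
import cmath
import math

def magnitude_spectrum(window):
    """Naive DFT magnitudes for bins 0..n/2 of a single window (O(n^2))."""
    n = len(window)
    return [
        abs(sum(window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def window_peaks(samples, window_size=64, hop=32, n_peaks=1):
    """Return (start_index, strongest_bins) per window; no 2d array is kept."""
    peaks = []
    for start in range(0, len(samples) - window_size + 1, hop):
        spectrum = magnitude_spectrum(samples[start:start + window_size])
        # Assumption: "maxima" means the n_peaks bins with the largest magnitude.
        top_bins = sorted(range(len(spectrum)), key=spectrum.__getitem__, reverse=True)[:n_peaks]
        peaks.append((start, sorted(top_bins)))
    return peaks

# Toy input (assumption): a pure tone with exactly 8 cycles per 64-sample window.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
for start, bins in window_peaks(tone):
    print(start, bins)
```

Only the list of (window position, peak bins) pairs survives each iteration, which is the raw material a fingerprint would be built from. The full spectrogram is needed only if you want to plot it.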