by rodw 1655 days ago
Can you expand on what you mean by "song recommendations" in this context? Do you mean recommendations like "if you like X you might also like Y"?

Assuming the answer is "yes", I'm not sure if I follow the leap from "non-negative matrix factorization as a bank of note templates (from their spectrograms)" to "song recommendations".

Very loosely speaking, my (not wholly uninformed) interpretation of the "note templates" bit is that it's sorta analogous to DFT/FFT analysis with frequency bins centered on the "regular" notes of the chromatic scale. That is, the cells of the matrix represent signal strength at the frequencies that correspond to conventional Western notes (e.g. integer-valued MIDI note numbers) or, in aggregated form, octave-independent pitch classes, rather than at arbitrary frequencies (which might fall between two conventional notes). It's a very useful representation of the component frequencies that appear in conventional music, but at the end of the day it's more or less "here are the notes (or pitch classes) active for this given beat".
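To make the "FFT bins aggregated into pitch classes" idea concrete, here is a rough sketch, not anything taken from the article. It assumes the standard equal-temperament mapping (A4 = 440 Hz = MIDI note 69); the function names, the sample rate, and the two-tone test signal are all made up for illustration.

```python
import numpy as np

def freq_to_midi(freq_hz):
    """Map a frequency in Hz to a (fractional) MIDI note number, with A4 = 440 Hz = 69."""
    return 69.0 + 12.0 * np.log2(freq_hz / 440.0)

def pitch_class_energy(signal, sample_rate):
    """Sum FFT magnitudes into 12 octave-independent pitch-class bins (0 = C, ..., 9 = A)."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Skip the DC bin (0 Hz), which has no pitch, then snap each bin to its nearest note.
    notes = np.rint(freq_to_midi(freqs[1:])).astype(int)
    energy = np.zeros(12)
    np.add.at(energy, notes % 12, magnitudes[1:])
    return energy

# Hypothetical test signal: one second of 440 Hz (A4) plus 659 Hz (close to E5 at 659.26 Hz).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t) + np.sin(2 * np.pi * 659.0 * t)

energy = pitch_class_energy(signal, sample_rate)
top_two = sorted(np.argsort(energy)[-2:].tolist())  # indices of the two strongest pitch classes
```

Here `top_two` picks out pitch classes 4 (E) and 9 (A), i.e. "the pitch classes active" for this slice of audio. A real pipeline would window the signal and work frame by frame (or beat by beat) rather than over one whole second.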

That is, I can imagine the role this information - essentially the musical score itself, or at least the pitch-specific dimension of that score - could play in a song-recommendation engine, but I'm curious how/why that specific spectrogram-template-based representation is significant.

Are you suggesting that that specific representation could be applied to song recommendations in a way that similar polyphonic-pitch-per-beat information derived from some other algorithm (FFT analysis for a hypothetical example) could not?

Or maybe I've misinterpreted your comment entirely?

1 comments

More likely, they mean using non-negative matrix factorization but with a bank of feature vectors instead of note templates. NNMF can be used in a wide variety of domains because it essentially encodes the problem of "this thing is a bit like this thing, a bit like this thing, and a bit like this other thing".

If, instead of representing intensity at different frequencies (as in the spectrograms), the numbers in each vector of the template bank represent other features (such as listener overlap with other artists/songs, or genre representation across multiple continuous "color" axes), then you can recommend music to a listener based on the similarity of the songs in their library to those in the template bank.
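A minimal sketch of that "a bit like this, a bit like that" decomposition, assuming a fixed template bank and the standard multiplicative update for the squared-error objective. The feature axes, the three templates, and the listener vector are all hypothetical numbers chosen for illustration; this is one way to do it, not necessarily what the original commenter had in mind.

```python
import numpy as np

def nmf_activations(v, templates, n_iter=500, eps=1e-9):
    """Find non-negative weights h so that templates @ h approximates v.

    The template bank is held fixed; only h is updated, using the
    multiplicative update rule for the squared-error (Frobenius) objective.
    """
    h = np.ones(templates.shape[1])
    for _ in range(n_iter):
        h *= (templates.T @ v) / (templates.T @ templates @ h + eps)
    return h

# Hypothetical template bank: rows are feature axes (e.g. genre "color" axes),
# columns are reference songs/artists described along those axes.
templates = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
])

# Hypothetical listener profile, built here as a known mix of templates 0 and 2
# so that the recovered weights can be compared with the mix that produced it.
true_mix = np.array([0.7, 0.0, 0.3])
listener = templates @ true_mix

h = nmf_activations(listener, templates)
ranking = np.argsort(h)[::-1]  # templates ordered from most to least similar
```

The weights in `h` say how much of each template is "in" the listener's library, so recommending amounts to walking down `ranking`. Because everything is constrained to be non-negative, a template can only add to the explanation, never cancel another one out, which is what makes the weights readable as "a bit like this thing".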

Ack'd. That makes more sense. I guess I took the comment too literally.