Hacker News new | ask | show | jobs
by dhammack 4593 days ago
Nice! I assume LDA was run on the lyrics?
1 comments

Not at all... "documents" were the user's play history and "words" were artists. So if you played one artist twice and another artist once it'd be like a document that says "artist1 artist1 artist2". The assumption is that document topics are analogous to music genres, and each artist creates music within a small set of genres each user prefers music within a small set of genres.
Ah, that explains why the groupings are so good. That's cool, I hadn't thought about topic models outside of NLP before.