
by rm999 3629 days ago

I don't get the innovation in this paper - are they just running word2vec on groups of items? If so, Spotify has been doing this on playlists for years now: https://erikbern.com/2013/11/02/model-benchmarks/

Also, I know the paper isn't claiming state-of-the-art, but their SVD results are horrendous. Standard CF would create much better artist-artist pairings with even a medium-sized dataset.

As an aside, I've run some quantitative and qualitative tests and have found that the best recommendations come from combining user-item and item-item. I co-gave a talk at the NYC machine learning meetup recently (https://docs.google.com/presentation/d/1S5Cizi9LFQ7l0bMYtY7g...) that shows how this can work, starting at slide 20. The idea is to create a candidate list of matches using item-item, and then reorder it using user-item. I've found that the item-item step produces "sensible" suggestions, while the re-ordering step is what truly personalizes them. You can remove obvious recommendations by dropping popular matches or matches the user has already interacted with (I consider this a business decision rather than something inherent in the algorithm).
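A minimal sketch of the two-stage idea described above, assuming items and the user are already embedded in the same vector space. The function name `recommend`, its parameters, and the choice of cosine similarity for the item-item step and a dot product for the user-item step are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def recommend(user_vec, item_vecs, seed_item, seen, n_candidates=50, top_k=10):
    """Item-item candidates for seed_item, re-ordered by a user-item score.

    All names and scoring choices here are hypothetical.
    """
    # Stage 1: cosine similarity between the seed item and every item.
    norms = np.linalg.norm(item_vecs, axis=1)
    sims = item_vecs @ item_vecs[seed_item] / (norms * norms[seed_item] + 1e-12)

    # Drop the seed itself and anything the user already interacted with.
    sims[seed_item] = -np.inf
    seen_idx = np.array(sorted(seen), dtype=int)
    sims[seen_idx] = -np.inf

    candidates = np.argsort(-sims)[:n_candidates]
    candidates = candidates[np.isfinite(sims[candidates])]

    # Stage 2: re-order the candidate list by user-item affinity.
    scores = item_vecs[candidates] @ user_vec
    order = np.argsort(-scores)
    return candidates[order][:top_k].tolist()
```

Filtering popular items would be one more mask applied before the candidate cut, which fits the view that it is a business rule layered on top of the algorithm.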

5 comments

meeper16 3629 days ago

Spotify got this from Berkeley Lab, who were doing it in 2005: "Word2Vec is based on an approach from Lawrence Berkeley National Lab" https://www.kaggle.com/c/word2vec-nlp-tutorial/forums/t/1234... That's interesting because the original streaming music site, SeeqPod, which powered Spotify, was based on vectors for songs, like a song2vec.


rahimnathwani 3629 days ago

From the Spotify blog post: "We train a model on subsampled (5%) playlist data using skip-grams and 40 factors."

Any idea what those 40 factors might be?

(The item2vec paper describes using pairs of items that occur in the same set, i.e. just like using n-grams, but without a fixed n, and ignoring ordering.)
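A rough sketch of the pair generation as described in the parenthetical above: every pair of distinct items that share a set becomes a training pair, with no window size and no ordering. The helper name `set_pairs` is hypothetical, and this is not the paper's code.

```python
from itertools import combinations

def set_pairs(item_sets):
    """Yield (target, context) pairs for all distinct items sharing a set."""
    for items in item_sets:
        unique = sorted(set(items))  # ignore duplicates and original order
        for a, b in combinations(unique, 2):
            # Emit both directions so each item serves as target and context.
            yield (a, b)
            yield (b, a)
```

A set of n distinct items produces n * (n - 1) pairs, so long playlists or baskets dominate the training data unless they are subsampled.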


rm999 3629 days ago

That's the dimensionality of the resulting word vectors in word2vec; in the item2vec paper this is the "dimension parameter m".


3pt14159 3629 days ago

Yeah, I "invented" this in 2011 or 2012 and it was one of the ideas behind the company that I sold. At the time I thought it was a clever hack, but I didn't see it as especially non-obvious.


neeraj1987 3626 days ago

Hi, very informative talk, especially the examples for handling cold start and seeding. Any pointers on how multiple entities are incorporated in the interaction matrix? I understand how user/item attributes may be incorporated, but multiple entities is something I am struggling to understand. Pointers to associated literature would help too.


rm999 3626 days ago

This paper covers mixing different types of data: http://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf (note that it covers a related but different technique). See figure 1 for an example of mixing ratings, indicator variables, and time into a single matrix.
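To make the "single matrix" idea concrete, here is a small sketch assuming a simplified layout: a one-hot user block, a one-hot item block, and one numeric time column. The layout and the names `encode_row` and `fm_predict` are illustrative assumptions; the paper's figure uses more blocks. The scoring function is the standard second-order factorization machine, computed with the usual linear-time rewrite of the pairwise term.

```python
import numpy as np

def encode_row(user, item, month, n_users, n_items):
    """One feature row: [one-hot user | one-hot item | time]. Hypothetical layout."""
    x = np.zeros(n_users + n_items + 1)
    x[user] = 1.0
    x[n_users + item] = 1.0
    x[-1] = month
    return x

def fm_predict(x, w0, w, V):
    """Second-order FM score: w0 + w.x + sum_{i<j} <V[i], V[j]> x_i x_j."""
    linear = w0 + w @ x
    # Pairwise term via 0.5 * sum_f ((V^T x)_f^2 - ((V^2)^T x^2)_f).
    s = V.T @ x
    s2 = (V ** 2).T @ (x ** 2)
    return linear + 0.5 * np.sum(s ** 2 - s2)
```

Adding another entity type (say, a device or a context) is then just another block of columns in `x`, with its own rows in `V`.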


neeraj1987 3622 days ago

Thanks a lot :-)


etangent 3629 days ago

"[before computing SVD], we normalized each entry according to the square root of the product of its row and column sums."

Why didn't they use something that usually works better, like PMI?
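For comparison, a sketch of both weightings on a co-occurrence count matrix. It assumes the quoted normalization means dividing each entry by the square root of its row sum times its column sum, which is one reading of the quote. The PMI version here is the positive-PMI variant (negative and undefined values clipped to zero), a common choice but not necessarily what the question has in mind. The function names are hypothetical.

```python
import numpy as np

def sqrt_normalize(C):
    """Divide each entry by sqrt(row_sum * col_sum); zero where undefined."""
    C = np.asarray(C, dtype=float)
    row = C.sum(axis=1, keepdims=True)
    col = C.sum(axis=0, keepdims=True)
    denom = np.sqrt(row * col)
    return np.divide(C, denom, out=np.zeros_like(C), where=denom > 0)

def ppmi(C):
    """Positive PMI: max(0, log(p(i, j) / (p(i) * p(j))))."""
    C = np.asarray(C, dtype=float)
    total = C.sum()
    row = C.sum(axis=1, keepdims=True)
    col = C.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(C * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts or empty rows/columns
    return np.maximum(pmi, 0.0)
```

The two differ mainly in how hard they discount popular items: the square-root form divides by the geometric mean of the marginals, while PMI divides by their full product before taking the log.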


rspeer 3629 days ago

This is a normalization that I have used and seen other people use. I don't think it's a foregone conclusion that PMI is better for every task.
