| HN Mirror



	by pcovington 3573 days ago
	The video embeddings in the paper are learned purely based on observing what users co-watch in sessions. In this sense, they can be thought of as latent factors in more traditional collaborative filtering approaches. When we inspect them, nearby vectors have a surprising amount of semantic similarity. Features about the videos such as titles and tags, as well as features derived from audio and video, are introduced in the ranking phase.
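To make "nearby vectors" concrete, here is a minimal sketch of a nearest-neighbor lookup over video embeddings using cosine similarity. The video ids and vectors are hypothetical toy values, and cosine similarity is an assumed choice of distance; the comment above does not say which measure was used for inspection.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(video_id, embeddings, k=2):
    """Return the k video ids whose embeddings are most similar to video_id."""
    query = embeddings[video_id]
    scored = [
        (cosine(query, vec), other)
        for other, vec in embeddings.items()
        if other != video_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]

# Hypothetical toy embeddings; real ones would come out of the trained model.
toy_embeddings = {
    "guitar_lesson_1": [0.9, 0.1, 0.0],
    "guitar_lesson_2": [0.8, 0.2, 0.1],
    "cooking_pasta": [0.0, 0.9, 0.3],
    "cooking_pizza": [0.1, 0.8, 0.4],
}

neighbors = nearest("guitar_lesson_1", toy_embeddings, k=1)
```

With these toy values the closest neighbor of one guitar lesson is the other guitar lesson, which is the kind of semantic grouping described above.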

1 comments

eva1984 3573 days ago

Are you guys using a model similar to Word2Vec to obtain the video embeddings?


pcovington 3573 days ago

word2vec did inspire earlier iterations of the model, but the key insight is that embeddings are learned jointly with all other model parameters. There is no separate source of embeddings. This way, embeddings are specialized for the specific task.
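A rough sketch of what "learned jointly" can mean in practice: the embedding table is just another trainable parameter, updated by the same gradient step as the rest of the model. Everything below is an assumption for illustration (the tiny sizes, the averaged-watch-history input, the single softmax layer, the plain SGD loop, and the toy sessions); it is not the architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_videos, dim = 6, 4

# Both tables are trainable parameters of one model (hypothetical sizes).
video_emb = rng.normal(scale=0.1, size=(num_videos, dim))  # input embeddings
out_w = rng.normal(scale=0.1, size=(dim, num_videos))      # softmax layer

# Hypothetical sessions: (previously watched video ids, next video id).
sessions = [([0, 1], 2), ([1, 2], 0), ([3, 4], 5), ([4, 5], 3)]

def train(video_emb, out_w, sessions, epochs=200, lr=0.5):
    """Plain SGD on softmax cross-entropy; updates both arrays in place."""
    history = []
    for _ in range(epochs):
        total = 0.0
        for watched, target in sessions:
            # Forward pass: average the watched embeddings, then softmax.
            user = video_emb[watched].mean(axis=0)
            logits = user @ out_w
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            total += -np.log(probs[target])

            # Backward pass: gradient of the loss w.r.t. the logits.
            dlogits = probs.copy()
            dlogits[target] -= 1.0
            grad_out = np.outer(user, dlogits)
            grad_user = out_w @ dlogits  # computed before out_w changes

            # One step updates the softmax layer AND the embeddings.
            out_w -= lr * grad_out
            np.add.at(video_emb, watched, -lr * grad_user / len(watched))
        history.append(total / len(sessions))
    return history

losses = train(video_emb, out_w, sessions)
```

The point of the sketch is the last two update lines: there is no pretraining stage that produces the embeddings separately, so they end up shaped by the prediction task itself.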


neeraj1987 3571 days ago

In general, what could be a separate source of embeddings? Also, how do these embeddings compare against traditional CF-based latent factors? (I ask this in terms of a recommender metric, not complexity.)
