Hacker News
by mlthoughts2018 2415 days ago
Your final paragraph is reasonable, and I do think the post would benefit from more clearly talking about this.

The “trick” is to find some naturally occurring property that indicates a pair of examples has the positive label, such as two users selecting a given product, two text queries leading to clicks on the same image, two different blog posts with the same keyword, etc. This is combined with an efficient way to sample acceptable negative pairs, i.e. pairs that do not express the trait that indicates a positive label.
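
To make that concrete, here is a rough sketch of pair mining, assuming the raw data is a list of (entity, trait) events such as (user, product). The function names `positive_pairs` and `sample_negative_pairs` and the toy events are made up for illustration. Real systems usually need smarter negative sampling than plain rejection sampling.

```python
import itertools
import random
from collections import defaultdict

def positive_pairs(events):
    """events: iterable of (entity, trait) tuples, e.g. (user, product).
    Two entities form a positive pair if they share at least one trait."""
    by_trait = defaultdict(set)
    for entity, trait in events:
        by_trait[trait].add(entity)
    pairs = set()
    for entities in by_trait.values():
        # Sort so each pair has a canonical (a, b) order with a < b.
        for a, b in itertools.combinations(sorted(entities), 2):
            pairs.add((a, b))
    return pairs

def sample_negative_pairs(entities, positives, k, rng, max_tries=10_000):
    """Rejection-sample up to k distinct pairs that are not positives."""
    entities = sorted(entities)
    negatives = set()
    tries = 0
    while len(negatives) < k and tries < max_tries:
        tries += 1
        a, b = sorted(rng.sample(entities, 2))
        if (a, b) not in positives:
            negatives.add((a, b))
    return negatives

# Hypothetical toy data: which user selected which product.
events = [("u1", "shoes"), ("u2", "shoes"),
          ("u3", "lamp"), ("u4", "lamp"), ("u1", "lamp")]
pos = positive_pairs(events)
neg = sample_negative_pairs({e for e, _ in events}, pos, k=2,
                            rng=random.Random(0))
```

Note that "not observed as positive" is only a proxy for "negative", so some sampled negatives will be false negatives. That is part of what makes the sampling step hard.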

That + deciding how to approach the loss function (e.g. cosine similarity, Euclidean distance, distance in a hash space, triplet loss, whether it should have a margin) is the whole trick of the problem.
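
For reference, a minimal sketch of two of those loss choices, written in plain Python on lists of floats so it runs without any ML framework. The margin defaults (0.2 and 1.0) are arbitrary assumptions, and in practice these would be written in an autodiff framework and averaged over a batch.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def triplet_loss(anchor, positive, negative, distance=euclidean, margin=0.2):
    """Zero once the negative is at least `margin` farther from the
    anchor than the positive is."""
    return max(0.0, distance(anchor, positive)
                    - distance(anchor, negative) + margin)

def contrastive_loss(u, v, is_positive, distance=euclidean, margin=1.0):
    """Pairwise alternative: pull positives together, push negatives
    apart until they are at least `margin` away."""
    d = distance(u, v)
    return d ** 2 if is_positive else max(0.0, margin - d) ** 2
```

The distance function is a parameter here, which reflects the point above: the choice of distance and the choice of margin are separate decisions.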

The black box that produces embeddings (DNN, doc2vec, sparse vector from tfidf-like features, whatever) and the nearest neighbor piece are standard plug and play.
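
As an illustration of how plug and play the lookup side is, here is a brute-force nearest neighbor sketch. It assumes the embeddings already exist as a dict from item name to vector. The `index` contents are made up, and a real deployment would likely swap in an approximate nearest neighbor library.

```python
import heapq
import math

def nearest_neighbors(query, index, k=2):
    """index: dict mapping item name -> embedding vector.
    Returns the k item names closest to `query` by Euclidean distance."""
    return heapq.nsmallest(
        k, index, key=lambda name: math.dist(query, index[name])
    )

# Hypothetical embeddings produced by whatever black box you like.
index = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
result = nearest_neighbors([0.9, 0.0], index, k=2)
```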

1 comment

Thanks, I think we are in agreement.