by sumit_psp 4697 days ago
Isn't it similar to what LSA (http://en.wikipedia.org/wiki/Latent_semantic_analysis) does?

From the linked paper:

Many different types of models were proposed for estimating continuous representations of words, including the well-known Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). In this paper, we focus on distributed representations of words learned by neural networks, as it was previously shown that they perform significantly better than LSA for preserving linear regularities among words.

It is similar, but according to Mikolov's response to reviewer comments regarding LDA/LSA/tf-idf [1], LDA does not preserve linguistic regularities such as king - man + woman ~ queen. I asked for additional clarification, but I haven't received a reply so far.

A good intuition as to why these kinds of regularities could even exist was given by Chris Quirk in a blog comment [2]. Essentially, imagine that each word is approximately represented by the contexts it appears in. If so, swapping the contexts of other words in and out could indeed preserve some linguistic regularities.

[1]: http://openreview.net/document/7b076554-87ba-4e1e-b7cc-2ac10...

[2]: http://www.blogger.com/comment.g?blogID=19803222&postID=5373...
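
To make the king - man + woman ~ queen regularity concrete, here is a minimal sketch in Python. The 2-d vectors are invented by hand for illustration (one axis loosely "royalty", the other loosely "gender"); they are not output from word2vec or any trained model, and `analogy` is a hypothetical helper, not anything from the paper's code.

```python
import numpy as np

# Hand-made toy vectors (an assumption for illustration, not learned embeddings).
# Axis 0 is loosely "royalty", axis 1 is loosely "gender".
vectors = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.0]),
}

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c, vectors):
    """Return the word closest (by cosine) to a - b + c, excluding a, b and c."""
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman", vectors)
print(result)
```

With these toy vectors the offset king - man + woman lands exactly on the vector for queen. Whether real embeddings behave this way depends on the model and the training data, which is exactly the point under discussion.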

Yeah, the bag-of-words and rank-k reduction stuff basically is LSA.
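
For anyone who hasn't seen it, a rough sketch of that bag-of-words plus rank-k reduction idea in Python with NumPy. The four-document corpus and the choice k = 2 are made up for illustration, and a real LSA pipeline would normally add weighting such as tf-idf, so treat this as a toy under those assumptions rather than a reference implementation.

```python
import numpy as np

# Toy corpus (invented for illustration).
docs = [
    "king queen royal palace",
    "queen king crown royal",
    "apple banana fruit juice",
    "banana apple fruit salad",
]

# Bag of words: build a term-document count matrix X (rows = words, columns = docs).
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# Rank-k reduction: keep only the k largest singular values and vectors.
k = 2  # assumed value; in practice this is one of the knobs you have to tune
U, s, Vt = np.linalg.svd(X, full_matrices=False)
word_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per word

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_king_queen = cosine(word_vecs[index["king"]], word_vecs[index["queen"]])
sim_king_apple = cosine(word_vecs[index["king"]], word_vecs[index["apple"]])
print(sim_king_queen, sim_king_apple)
```

In this toy corpus, words that share documents end up with nearly parallel vectors, while words from the unrelated documents end up nearly orthogonal.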

I wrote an LSA implementation a few years ago for a BI product (written up at http://www.innoveerpunt.nl/interactief-innoveren/innoveerpun... sorry that it's in Dutch :) ).

I wonder how well it works; my takeaway was that you need to tweak the internal thresholds and matrix sizes a lot to get optimal results, and the right settings depend heavily on the datasets you use (which every LSA paper you'll read also makes very clear).

Latent Relational Analysis is more in the spirit of LSA; see http://research.microsoft.com/apps/video/dl.aspx?id=104771