by finnh 2379 days ago
Even without a priori semantic goals, I think you can also transform/rotate your dimensions to maximize their interpretability.

Simplified example: if your two-dimensional system gives you two points:

    -0.5, 0.5
    0.5, 0.5

Then you losslessly rotate to

    0, 1.0
    1.0, 0

With the idea that the latter is simpler for humans to assign semantics to.
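
To make the example concrete, here is a minimal sketch in plain Python. It assumes each row above is a 2-D point, and the helper names (`rotate`, `distance`) are made up for illustration. Note that a pure 45-degree rotation lands the points on the axes at length of about 0.707; reaching exactly 1.0 also takes a uniform scale by sqrt(2), which changes no angles or relative distances.

```python
import math

def rotate(point, degrees):
    """Rotate a 2-D point counter-clockwise by the given angle."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

points = [(-0.5, 0.5), (0.5, 0.5)]

# Rotate clockwise by 45 degrees so each point lies on a coordinate axis.
rotated = [rotate(p, -45) for p in points]

# Uniform scaling by sqrt(2) brings the axis-aligned points to unit length.
scale = math.sqrt(2)
aligned = [(scale * x, scale * y) for x, y in rotated]

# The rotation itself preserves the distance between the two points.
before = distance(points[0], points[1])
after = distance(rotated[0], rotated[1])

print(rotated)
print(aligned)
print(before, after)
```

The rotation step is the "lossless" part: distances and angles between points are unchanged, so only the labelling of the axes differs.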

This is the idea behind non-negative matrix formulation (NMF). As the name implies, it forces the entries of the embedding matrices (for both the reduced document and term matrices) to be nonnegative, which results in a more interpretable “sum of parts” representation. You can really see the difference (compared to LSA/SVD/PCA, which do not have this constraint) when it’s applied to images of faces. Also, word2vec has been shown to implicitly perform a matrix factorization, though not NMF specifically. The classic NMF paper is here: http://www.cs.columbia.edu/~blei/fogm/2019F/readings/LeeSeun...
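
For anyone who wants to see the nonnegativity constraint in action, here is a rough sketch of NMF using multiplicative updates for the squared-error objective, in plain Python with no libraries. The toy document-term matrix `V`, the rank, the iteration count, and all helper names are assumptions for illustration, not anything from the paper's experiments.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Approximate a non-negative n x m matrix V as W H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start; multiplicative updates keep entries >= 0.
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H

# Toy matrix: rows are documents, columns are term counts (made-up data).
V = [[2, 1, 0, 0],
     [4, 2, 0, 0],
     [0, 0, 1, 3],
     [0, 0, 2, 6]]

W, H = nmf(V, rank=2)
print(frob_error(V, W, H))
```

Because every entry of `W` and `H` stays nonnegative, each document is rebuilt only by adding parts together, never by cancelling one component against another, which is where the “sum of parts” reading comes from.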

PS—There should be a negative sign on the (2,2) entry of the first matrix.

> non-negative matrix formulation (NMF)

*factorization ;)

PCA follows a similar idea as well (I mean, rotating vectors), but it's usually done in a much lower-dimensional space.

Ugh, that one was auto-correct, I swear. I have no idea what’s going on at Apple’s NLP department.

Is this the same thing a 'variational autoencoder' is intended to do?

Also, is it possible in non-variational implementations (like this one) that some of the dimensions represent multiple groups? For example, not just 0.5 and -0.5 groups, but also a 0.0 group in the middle. Then your rotation wouldn't be sufficient; you would need to increase the dimensionality to cleanly separate the groups.