by Houshalter
3806 days ago
The diagram is only a visualization. The actual word vectors have many dimensions. To reduce them to 2 dimensions, they use a method that tries to keep similar vectors as close together as possible while keeping dissimilar ones apart. That is what creates the shape seen in the scatter plot. The shape by itself doesn't tell you anything about the underlying data.
From one of the people who created t-SNE:
When I run t-SNE, I get a strange ‘ball’ with uniformly distributed points?
This usually indicates you set your perplexity way too high. All points now want to be equidistant. The result you got is the closest you can get to equidistant points as is possible in two dimensions. If lowering the perplexity doesn’t help, you might have run into the problem described in the next question. Similar effects may also occur when you use highly non-metric similarities as input.
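
To make the "all points now want to be equidistant" part concrete, here is a rough sketch in plain Python. It is not the real t-SNE code: the function names and the toy points are made up for illustration, and it works in the opposite direction from t-SNE (which takes a target perplexity and searches for the Gaussian bandwidth per point). It only shows that once the bandwidth is wide enough to reach a high perplexity, every other point gets almost the same similarity, so the cluster structure is no longer visible to the layout step.

```python
import math

def conditional_probs(points, i, sigma):
    """Gaussian similarities p(j|i) from point i to every other point.

    Hypothetical helper for illustration; sigma is the Gaussian bandwidth.
    """
    weights = []
    for j, p in enumerate(points):
        if j == i:
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(points[i], p))
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits): the effective neighbor count."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy

# Toy data (an assumption): two tight clusters, far apart, in 3 dimensions.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0)]

narrow = conditional_probs(points, 0, sigma=0.1)    # low perplexity
wide = conditional_probs(points, 0, sigma=100.0)    # perplexity near the maximum

print("narrow:", [round(p, 3) for p in narrow], "perplexity", round(perplexity(narrow), 2))
print("wide:  ", [round(p, 3) for p in wide], "perplexity", round(perplexity(wide), 2))
```

With the narrow bandwidth, point 0 only "sees" its two cluster mates. With the wide one, all five other points look about equally similar, which is the situation the FAQ answer describes.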