by Hizonner 479 days ago
They literally just represent clusters in a learned embedding vector space that's not necessarily well understood, but is believed to map words or phrases with similar meanings to vectors that point in similar directions in a high-dimensional space. The axes themselves don't have any understandable meanings.

https://en.wikipedia.org/wiki/T-distributed_stochastic_neigh...
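
To make "vectors that point in similar directions" concrete, here is a minimal sketch using cosine similarity. The three 4-dimensional vectors are made up for illustration and do not come from any real model (real embeddings typically have hundreds of dimensions), and the names `toy_embeddings` and `cosine_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-picked toy vectors, NOT output from a trained model.
# The individual axes carry no meaning; only the relative directions matter.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "kitten": [0.8, 0.9, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9, 0.8],
}

sim_close = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_far = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])

print(f"cat vs kitten:  {sim_close:.3f}")
print(f"cat vs invoice: {sim_far:.3f}")
```

With these toy values, "cat" and "kitten" score much closer to 1 than "cat" and "invoice" do. That is the sense in which similar meanings are believed to land near each other, even though no single axis is interpretable on its own.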


I appreciate you trying to help, but I think you are misunderstanding my complaint. The issue is that even with embeddings, things get labels: clusters have labels (organized by color) and individual data points have labels. Neither of these is well defined, so it is not clear what is being said.

(I'm actually well aware of t-SNE. FYI, it is not a great tool to use, and people often conflate it with PCA or with dimensionality reduction in general. Probably fine here because it is concerned with grouping.)