by HanClinto 775 days ago
What is being used to build the data map -- how does one project these document vectors into 2D space?

OP is also the author of the popular dimensionality reduction algorithm UMAP.

I guess the pipeline was: embed the documents with an LLM (or even a plain old word2vec average over the abstract might do it), then use UMAP with a cosine metric to reduce those embeddings to 2 dimensions.
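To make that guess concrete, here is a rough sketch of the word2vec-average variant. This is not OP's actual code: `TOY_VECTORS`, `embed`, `cosine_distance`, and `project_2d` are names made up for illustration, and the toy vectors stand in for a real pretrained model. The only real library call is `umap.UMAP(n_components=2, metric="cosine").fit_transform(...)` from the umap-learn package, imported lazily so the rest runs without it.

```python
import math

# Hypothetical 3-d word vectors standing in for a real word2vec model.
TOY_VECTORS = {
    "graph": [1.0, 0.0, 0.0],
    "network": [0.9, 0.1, 0.0],
    "protein": [0.0, 1.0, 0.0],
    "folding": [0.0, 0.9, 0.1],
}


def embed(text, vectors=TOY_VECTORS):
    """Embed a document as the average of its known word vectors."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        raise ValueError("no known words in text")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity; the notion of distance UMAP is asked to use below."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def project_2d(embeddings):
    """Reduce embeddings to 2D with UMAP and a cosine metric."""
    import umap  # third-party: pip install umap-learn

    return umap.UMAP(n_components=2, metric="cosine").fit_transform(embeddings)
```

On a real corpus you would call `project_2d([embed(abstract) for abstract in abstracts])` and scatter-plot the result. UMAP wants more than a handful of documents, so the toy vocabulary here is only good for checking the embedding step.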

I have no idea about the colors and the local cluster naming, though. Maybe that's handcrafted.