by BenoitP
773 days ago
OP is also the author of UMAP, the popular dimensionality reduction algorithm. My guess is that the pipeline embedded each document with an LLM (even a plain old word2vec average over the abstract might do), then used UMAP with a cosine metric to reduce the embeddings to 2 dimensions. I have no idea how the colors and local cluster names were produced, though. Maybe those are handcrafted.
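To make that guess concrete, here is a rough sketch of such a pipeline. It is not OP's actual code. The toy word vectors, `embed`, `cosine_distance` and `project_2d` are all made up for illustration. The only real library call is `umap.UMAP(n_components=2, metric="cosine")` from the umap-learn package, imported lazily so the rest runs without it.

```python
import math

# Hypothetical 3-dimensional "word vectors" standing in for real word2vec output.
TOY_VECTORS = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "protein": [0.0, 0.9, 0.2],
    "folding": [0.1, 0.8, 0.3],
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words in `text`; unknown words are skipped."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        raise ValueError("no known words in text")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity a cosine metric compares points by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def project_2d(embeddings):
    """Reduce embeddings to 2 dimensions with UMAP (assumes umap-learn is installed)."""
    import umap  # imported here so the helpers above work without umap-learn

    reducer = umap.UMAP(n_components=2, metric="cosine")
    return reducer.fit_transform(embeddings)


ml_doc = embed("neural network")
bio_doc = embed("protein folding")
# Documents on different topics end up farther apart than related ones.
print(cosine_distance(ml_doc, bio_doc) > cosine_distance(ml_doc, embed("network")))
```

`project_2d` only makes sense on a real corpus, since UMAP's default neighborhood size is larger than this four-word toy set. The colors and cluster labels are not covered here, since I don't know how those were made.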