OP is also the author of the popular dimensionality reduction algorithm UMAP.

My guess is that the pipeline embedded each document with an LLM (even a plain old word2vec average over the abstract might do it), and then used UMAP with a cosine metric to reduce those embeddings to 2 dimensions.
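To make that guess concrete, here is a rough sketch of such a pipeline, not OP's actual code. The toy word vectors and the helper names (`embed`, `cosine_distance`, `project_2d`) are made up for illustration; a real run would load trained word2vec vectors or call an embedding model. The only library call assumed is `umap.UMAP(n_components=2, metric="cosine")` from the umap-learn package, imported lazily so the rest runs without it.

```python
import math

# Hypothetical toy word vectors, for illustration only. A real pipeline would
# load trained word2vec vectors or use an LLM embedding model instead.
TOY_VECTORS = {
    "graph": [1.0, 0.0, 0.0],
    "neural": [0.0, 1.0, 0.0],
    "network": [0.5, 0.5, 0.0],
    "protein": [0.0, 0.0, 1.0],
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words in `text`; unknown words are skipped."""
    hits = [vectors[w] for w in text.lower().split() if w in vectors]
    if not hits:
        raise ValueError("no known words in text")
    dim = len(hits[0])
    return [sum(v[i] for v in hits) / len(hits) for i in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity: the distance UMAP uses when metric='cosine'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def project_2d(embeddings):
    """Reduce embeddings to 2D with UMAP under a cosine metric.

    Requires the umap-learn package, imported here so the helpers above work
    without it. Meant for a realistically sized corpus, not a handful of points.
    """
    import umap

    reducer = umap.UMAP(n_components=2, metric="cosine")
    return reducer.fit_transform(embeddings)
```

With a real corpus you would call `project_2d([embed(abstract) for abstract in abstracts])` and scatter-plot the two output columns.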

I have no idea about colors and local cluster naming though. Maybe that's handcrafted.