by haxton
1020 days ago
Curious to know what value you've seen out of these clusters. In my experience k-means clustering was very lackluster. Having to define the number of clusters up front was a big pain point too. You almost certainly want a graph-like structure (overlapping communities rather than clusters). But unsupervised clustering was almost entirely ineffective for every use case I had :/

I mainly like it as another example of the kind of thing you can use embeddings for.
My implementation is very naive - it's just this: https://github.com/simonw/llm-cluster/blob/main/llm_cluster....
I imagine there are all kinds of improvements that could be made to this kind of thing. I'd love to understand if there's a good way to automatically pick an interesting number of clusters, as opposed to picking a number at the start.
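
On the question of picking the number of clusters automatically: one common heuristic is to run k-means for several values of k and keep the k with the highest mean silhouette score. The sketch below is a rough illustration of that idea only, not taken from the linked implementation. It is pure Python, treats embeddings as plain tuples of floats, and uses a deterministic farthest-first initialisation so results are repeatable. The function names (`farthest_first`, `kmeans`, `silhouette`, `pick_k`) are all made up for this example.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, k):
    """Deterministic init (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette score in [-1, 1]; higher means tighter, better
    separated clusters. Points in singleton clusters score 0."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(p, points[j]) for j in idx) / len(idx)
            for l, idx in clusters.items()
            if l != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def pick_k(points, k_values):
    """Try each k and return (best_k, {k: silhouette score})."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores
```

Calling `pick_k(embeddings, range(2, 11))` would try 2 through 10 clusters. This is quadratic in the number of points per k, so it only suits small collections, and a high silhouette score says the clusters are well separated, not that they are interesting.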