| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by stefanka 1019 days ago
	You could also use a Bayesian version of kmeans. It applies a Dirichlet process as a prior to an infinite (truncated) set of clusters such that the most probable number k is automatically found. I found one implementation here: https://github.com/vsmolyakov/DP_means Alternatively, there is a Bayesian GMM in sklearn. When you restrict it to diagonal Covariance matrices, you should be fine in high dimensions

1 comments

Having close centers might help with the labeling. Let me know if I can help