by nerdponx
1558 days ago
Note also that for one-dimensional data specifically, the k-means clustering problem has a globally optimal solution that can be found exactly. There is an R package that implements it with a C++ core [1], and there is also a Python wrapper [2]. The implementation is surprisingly fast, so you can brute-force many different numbers of clusters and compare them using the silhouette score. The advantage over traditional k-means is that you don't need to try multiple initializations for any given number of clusters, because the algorithm is deterministic and guaranteed to find the global optimum.

[1]: https://cran.r-project.org/package=Ckmeans.1d.dp
[2]: https://github.com/djdt/ckwrap
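To illustrate why the 1-D case can be solved exactly: once the data is sorted, every optimal cluster is a contiguous run, so a dynamic program over split points finds the global optimum. The following is only a rough pure-Python sketch of that idea, not the code from either package above. The function name `optimal_kmeans_1d` and its return format are made up for illustration, and this naive version is O(k·n²), so it will be much slower than the linked implementation.

```python
def optimal_kmeans_1d(values, k):
    """Return (clusters, cost): a globally optimal k-means split of 1-D data.

    Hypothetical helper for illustration; not the Ckmeans.1d.dp API.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums of x and x^2 give the within-cluster sum of squares
    # of any contiguous run xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best total cost of splitting xs[:j] into c clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[c][j]: start index of the last cluster in that best split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + sse(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    back[c][j] = i

    # Walk the back-pointers from the end to recover cluster boundaries.
    clusters = []
    j = n
    for c in range(k, 0, -1):
        i = back[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return clusters, cost[k][n]


data = [1.0, 5.2, 0.9, 10.1, 5.0, 1.1, 9.9]
clusters, cost = optimal_kmeans_1d(data, 3)
print(clusters, cost)
```

There is no random initialization anywhere, so the same input always gives the same answer, which is the point about not needing multiple restarts.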