by keeeba
231 days ago
I agree it is a profound question. My thesis is fairly boring: for any given clustering task of interest, there is no single value of K. Clustering, and unsupervised machine learning more broadly, is as much about creating meaning and structure as it is about discovering or revealing it. Take biological taxonomy: what K will best segment the animal kingdom? There is no true value of K. If your answer is for a child, maybe it's 6, corresponding to what we're taught in school - mammals, birds, reptiles, amphibians, fish, and invertebrates. If your answer is for a zoologist, obviously this won't do. Every clustering task of interest is like this. And I say "of interest" because clustering things like digits in the classic MNIST dataset is better posed as a classification problem - the categories are defined in advance.
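To make the "no single K" point concrete, here is a toy sketch, not a serious method. It assumes made-up 1-D points with nested structure and a naive single-linkage merge that I wrote for illustration (`single_linkage` and `points` are hypothetical names, not from any library). The same data gives a perfectly clean partition at K=4 and at K=2, and which one you want depends on who is asking.

```python
def single_linkage(points, k):
    """Naive 1-D single-linkage clustering: merge the two closest clusters until k remain."""
    # Start with every point in its own cluster, ordered along the line.
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        # In 1-D with ordered clusters, the closest pair is always adjacent,
        # so only the gaps between neighbouring clusters need checking.
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        # Replace the two neighbouring clusters with their union.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Invented data: pairs nested inside two larger groups.
points = [0, 1, 10, 11, 100, 101, 110, 111]

fine = single_linkage(points, 4)    # four tight pairs
coarse = single_linkage(points, 2)  # two broad groups

print(fine)
print(coarse)
```

Neither output is "wrong". The data supports both levels of the hierarchy, and choosing K is choosing the level of description you care about.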