Hacker News new | ask | show | jobs
by nothrowaways 1175 days ago
>3) doesn't require the means to be datapoints in the dataset.

I actually thought this was a k means strength :)

2 comments

I think the real strength of this method is that it doesn't require the data live in a vector space. Once you give up that structure, you're pretty much locked in to using points from the dataset as the cluster representatives, unless you assume some other structure.
A strength if you're strictly looking to minimize squared L2 loss from each point to its closest mean -- but for a lot of other applications, it's a weakness! As the other poster mentioned, with KMedoids you can use arbitrary loss functions and cluster exotic objects (not restricted to metrics on a vector space)