I think the real strength of this method is that it doesn't require the data to live in a vector space. Once you give up that structure, you're pretty much locked into using points from the dataset as the cluster representatives, unless you assume some other structure.
A strength if you're strictly looking to minimize squared L2 loss from each point to its closest mean -- but for a lot of other applications, it's a weakness! As the other poster mentioned, with KMedoids you can use arbitrary loss functions and cluster exotic objects (you're not restricted to metrics on a vector space).
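To make that concrete, here's a rough sketch of k-medoids clustering plain strings under edit distance, where a "mean" wouldn't even be defined. Everything in it is my own illustration, not any library's API: the names `edit_distance` and `k_medoids`, the naive "first k items" initialization, and the simple alternating assign/update loop (a Voronoi-style iteration, not the full PAM swap search). Treat it as a toy under those assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (hypothetical helper)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def assign(d, medoids):
    """Label each item with the index of its closest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def k_medoids(items, k, dissim, max_iter=100):
    """Alternating k-medoids over an arbitrary dissimilarity function.

    Only pairwise dissimilarities are used, so `items` can be any objects.
    Assumption: the first k items serve as the initial medoids.
    """
    n = len(items)
    d = [[dissim(items[i], items[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(d, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Empty cluster: keep the old medoid rather than fail.
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total
            # dissimilarity to the rest of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(d[m][i] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return [items[m] for m in medoids], assign(d, medoids)


words = ["kitten", "sitten", "mitten", "flaw", "lawn", "claw"]
medoids, labels = k_medoids(words, 2, edit_distance)
print(medoids, labels)
```

Note that the representatives that come back are always actual items from the input, and nothing in `k_medoids` cares that the dissimilarity is edit distance. Any function of two items would do, including ones that aren't metrics.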