I'd say the justification for all of the large languages (like C++ and Common Lisp) is the same. You may not need everything in them, but the features you do need will at least be standardized, unlike in a small language that relies on ad-hoc extensions to achieve the same thing.
[ Here I include the standard library as part of the language; for languages with extensive meta-programming support, like C++ and Common Lisp, it makes little sense to distinguish the two. ]
You've hit the nail on the head there, though: when offered a bloated solution, people will 'subset'; when offered something small and powerful, people will customize.
No, that's still not my point. Each user uses a different subset of Word. Hence, the only set that contains all of the features that all users use is Word itself. If everyone uses a different subset, it's not accurate to call it bloated or too large, or to complain that it has unnecessary features. All features are necessary for someone.