Those moving parts are defined by the complexity of the business. Banking software with 20000 classes deployed in a J2EE application server on a mainframe would not be much different.
That's not really true, since microservices boil down to RPC over a network. There are far more failure modes involved when you have 2000 asynchronous processes talking to one another.
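To make the failure-mode point concrete, here is a rough Python sketch. Everything in it is made up for illustration (`Outcome`, `RemoteError`, `make_flaky_remote_add`, `call_with_retry`), and the "remote" call is a scripted fake rather than real networking. The point is only that an in-process call either returns or raises, while a call across a network adds outcomes the caller has to handle explicitly.

```python
import enum


class Outcome(enum.Enum):
    OK = "ok"
    TIMEOUT = "timeout"          # request or reply lost; the caller cannot tell which
    UNAVAILABLE = "unavailable"  # peer is down or unreachable
    BAD_REPLY = "bad_reply"      # a reply arrived but could not be decoded


class RemoteError(Exception):
    """Hypothetical error type carrying which network failure occurred."""

    def __init__(self, outcome):
        super().__init__(outcome.value)
        self.outcome = outcome


def local_add(a, b):
    # In-process call: it returns or raises, with no network in between.
    return a + b


def make_flaky_remote_add(script):
    """Return a fake remote 'add' that plays back a scripted list of outcomes."""
    outcomes = iter(script)

    def remote_add(a, b):
        outcome = next(outcomes)
        if outcome is Outcome.OK:
            return a + b
        raise RemoteError(outcome)

    return remote_add


def call_with_retry(fn, *args, attempts=3):
    """Retry on RemoteError; return (result, failures seen along the way)."""
    failures = []
    for _ in range(attempts):
        try:
            return fn(*args), failures
        except RemoteError as err:
            failures.append(err.outcome)
    # Every attempt failed: surface the last failure to the caller.
    raise RemoteError(failures[-1])
```

Note that retrying after a timeout is only safe if the operation is idempotent, because the first request may have been processed even though its reply was lost. That concern simply does not exist for `local_add`.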
That is true, but it also allows for much more orderly start-up and shut-down as well as automatic recovery. A service is a pretty well defined entity that can be exhaustively tested far more easily than the corresponding monolith with 2000 classes and tons of non-local effects. Using processes for that purpose has definite advantages. See "Erlang/OTP" for an example of how this can give you incredibly solid distributed architectures.
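As a rough illustration of the automatic recovery idea, here is a minimal Python sketch of a supervisor that restarts a crashed worker up to a limit. It is loosely modelled on the one-for-one restart strategy in OTP, but it is not OTP's API: `Supervisor`, `make_crashy_worker`, and the log format are hypothetical names, and the worker is a plain function running in-process, whereas real Erlang supervisors manage isolated processes.

```python
class Supervisor:
    """Restart a failed worker, giving up after max_restarts restarts."""

    def __init__(self, max_restarts=3):
        self.max_restarts = max_restarts
        self.log = []  # ordered record of lifecycle events

    def run(self, name, worker):
        restarts = 0
        while True:
            self.log.append(("start", name))
            try:
                result = worker()
            except Exception as exc:
                self.log.append(("crash", name, str(exc)))
                if restarts >= self.max_restarts:
                    # Restart budget exhausted: escalate instead of looping forever.
                    self.log.append(("give_up", name))
                    raise
                restarts += 1
                continue
            self.log.append(("stop", name))
            return result


def make_crashy_worker(crashes_before_success):
    """Hypothetical worker that fails a fixed number of times, then succeeds."""
    # The counter stands in for an external fault that eventually clears.
    state = {"remaining": crashes_before_success}

    def worker():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("simulated crash")
        return "done"

    return worker


if __name__ == "__main__":
    sup = Supervisor(max_restarts=3)
    print(sup.run("billing", make_crashy_worker(2)))
    for event in sup.log:
        print(event)
```

The useful property is that start, crash, restart, and give-up are explicit, observable events at a well defined boundary, which is much harder to get when the same failure happens somewhere inside a monolith.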