My point is that if someone can guess the password on a random test box and, through that, gain access to critical internal systems, you have lost the right to call your system "not vulnerable".
Well, if your "legacy non-production test tenant" can be opened by just guessing passwords, and it allows access to "very much in use production non-test" tenants, then you could say MS has a vulnerability. It may not be a buffer overflow, but it is a vulnerability nonetheless.
Yes, and I think most people would consider it a vulnerability if an authentication system doesn't rate-limit or otherwise slow/stop "password spray" attacks.
You can rate-limit individual users, but password spray attacks spread their attempts across a large number of accounts so that they stay undetected in an authentication system used by even more users.
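To make that concrete, here is a rough sketch of why a per-account limit alone misses a spray, and what a second counter might look like. Everything in it is hypothetical: the `SprayDetector` class, the thresholds, and the idea of keying on a single `source` (an IP, say) are assumptions for illustration, not how any real identity provider works. A real attacker may also spread attempts across many sources, which this sketch would not catch.

```python
from collections import defaultdict, deque


class SprayDetector:
    """Toy failed-login tracker (hypothetical names and thresholds)."""

    def __init__(self, window_seconds=300.0, max_failures_per_user=5,
                 max_distinct_users_per_source=10):
        self.window = window_seconds
        self.max_failures_per_user = max_failures_per_user
        self.max_distinct_users_per_source = max_distinct_users_per_source
        self._by_user = defaultdict(deque)    # username -> failure timestamps
        self._by_source = defaultdict(deque)  # source -> (timestamp, username)

    def record_failure(self, source, username, now):
        """Record one failed login and return the set of limits it trips."""
        cutoff = now - self.window

        user_q = self._by_user[username]
        user_q.append(now)
        while user_q and user_q[0] < cutoff:
            user_q.popleft()

        src_q = self._by_source[source]
        src_q.append((now, username))
        while src_q and src_q[0][0] < cutoff:
            src_q.popleft()

        alerts = set()
        # Classic per-account limit: many failures against one user.
        if len(user_q) > self.max_failures_per_user:
            alerts.add("per_user")
        # Spray signal: one source failing against many different users.
        if len({u for _, u in src_q}) > self.max_distinct_users_per_source:
            alerts.add("spray")
        return alerts


# One guess per account from a single source: the per-user limit never
# fires, but the distinct-account counter does once it passes 10 users.
detector = SprayDetector()
results = [detector.record_failure("203.0.113.7", f"user{i}", now=float(i))
           for i in range(20)]
print(any("per_user" in r for r in results))  # False
print("spray" in results[10])                 # True
```

The point is only that the useful signal is the number of distinct accounts failing, not the number of failures per account.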