It depends. If all the requests are made in parallel, they should hit different instances, which distributes load more efficiently than pinning everything to a single instance. You would actually get the response in 200ms, not to mention that it becomes easier to size each node properly. It also gives you a grey area where a response can be partial rather than just pass/fail. As usual, YMMV depending on the use case.
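To make the parallel / partial-response point concrete, here is a rough sketch. Everything in it is made up for illustration: `fetch_part` simulates a backend call with a sleep instead of real HTTP, and the part names and delays are hypothetical. It only shows the shape of the idea: fire the calls concurrently, keep whatever succeeded, and report what failed.

```python
import asyncio
import time


async def fetch_part(name: str, delay: float, fail: bool = False) -> dict:
    """Hypothetical stand-in for one backend call (no real network I/O)."""
    await asyncio.sleep(delay)  # simulated latency of a single instance
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return {name: f"{name}-data"}


async def fetch_page(parts: list) -> dict:
    """Run all calls concurrently and assemble a possibly partial result."""
    results = await asyncio.gather(
        *(fetch_part(name, delay, fail) for name, delay, fail in parts),
        return_exceptions=True,  # a failed part does not sink the others
    )
    data, errors = {}, []
    for (name, _, _), result in zip(parts, results):
        if isinstance(result, Exception):
            errors.append(name)
        else:
            data.update(result)
    return {"data": data, "errors": errors}


def run_demo():
    # Three assumed calls of 0.1 s each; the last one is forced to fail.
    parts = [
        ("user", 0.1, False),
        ("orders", 0.1, False),
        ("recommendations", 0.1, True),
    ]
    start = time.perf_counter()
    page = asyncio.run(fetch_page(parts))
    elapsed = time.perf_counter() - start
    return page, elapsed


if __name__ == "__main__":
    page, elapsed = run_demo()
    print(page, f"{elapsed:.2f}s")
```

The total time is roughly that of the slowest call rather than the sum of all three, and the caller still gets `user` and `orders` even though `recommendations` failed. Whether that partial result is acceptable is exactly the use-case-dependent part.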
I disagree. You're shifting your "organization" to the frontend which becomes an utter cesspool of chaos. If anything changes in the backend, you will need to change the frontend as well.
The whole point of an API is to decouple. GraphQL does the exact opposite.