I would redesign your protocol to be request/response based, akin to HTTP. Achieve performance by using multiple connections in the client. Simplicity > efficiency, especially if you don't have the engineering resources of a company like Google.
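To make the suggestion concrete, here is a minimal sketch of a request/response client that gets its concurrency from several plain connections rather than from multiplexing on one. Everything in it is an assumption for illustration: the line-delimited framing, the `EchoUpperHandler` test server, and the `PooledClient` name are hypothetical and not taken from the protocol under discussion.

```python
import queue
import socket
import socketserver
import threading
from concurrent.futures import ThreadPoolExecutor


class EchoUpperHandler(socketserver.StreamRequestHandler):
    """Hypothetical server: reads one line per request, replies with one line."""

    def handle(self):
        for line in self.rfile:
            self.wfile.write(line.upper())


class PooledClient:
    """Request/response client: each connection carries one request at a time."""

    def __init__(self, address, size):
        self._pool = queue.Queue()
        for _ in range(size):
            sock = socket.create_connection(address)
            self._pool.put((sock, sock.makefile("rb")))

    def request(self, payload: bytes) -> bytes:
        # Assumes payload contains no newline, since newline is the frame delimiter.
        sock, reader = self._pool.get()  # blocks until a connection is free
        try:
            sock.sendall(payload + b"\n")
            return reader.readline().rstrip(b"\n")
        finally:
            self._pool.put((sock, reader))

    def close(self):
        while not self._pool.empty():
            sock, reader = self._pool.get()
            reader.close()
            sock.close()


def demo():
    # Port 0 asks the OS for any free port on localhost.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoUpperHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    client = PooledClient(server.server_address, size=4)
    try:
        # Eight callers share four connections; map() returns results in input order.
        with ThreadPoolExecutor(max_workers=8) as workers:
            return list(workers.map(client.request, [b"a", b"b", b"c"]))
    finally:
        client.close()
        server.shutdown()
        server.server_close()
```

The trade-off is the one stated above: no request IDs, no interleaving, and no per-stream flow control to get wrong, at the cost of holding more sockets open.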
And I'm out. The reply rate limiting is infuriating.
It's really easy to glibly criticize someone else's design decisions when 1) you don't have a full understanding of their problem, architecture, or rationale for that architecture, and when 2) the medium of the conversation doesn't lend itself well to providing you a satisfactory explanation.
It seems as though you've gotten the tiniest glimpse of some details about the system and gone on to assume he made a boneheaded decision and that you know better. Do you have some secret evidence that he's incompetent and doesn't have a good reason for his decision?