'Language', as we know, is quite an ambiguous and inefficient way of communicating.
My guess is that what they will probably evolve is some kind of audio encoding format for the exact information they want to convey to the other entity, like literally sending it control inputs over audio.
Given that the AIs were playing with humans, perhaps using speech-to-text/text-to-speech as an interface, in order to be "human compatible", would overcome some of that.
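
To make the "control inputs over audio" idea a bit more concrete, here is a rough sketch of what such an encoding could look like. It assumes plain binary FSK (one tone per bit), which is just a stand-in I picked for illustration; the sample rate, tone frequencies, and function names are all made up, and nothing here is a claim about what the AIs would actually converge on.

```python
import math

# All constants below are illustrative assumptions, not a real protocol.
SAMPLE_RATE = 8000   # samples per second
BIT_SAMPLES = 80     # samples per bit, i.e. 100 bits per second
FREQ_ZERO = 1000.0   # tone used for a 0 bit, in Hz
FREQ_ONE = 2000.0    # tone used for a 1 bit, in Hz


def encode(payload):
    """Turn each bit of the payload into a short sine burst."""
    samples = []
    for byte in payload:
        for shift in range(7, -1, -1):  # most significant bit first
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            for n in range(BIT_SAMPLES):
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_power(chunk, freq):
    """Goertzel algorithm: power of the chunk at a single frequency."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in chunk:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode(samples):
    """Recover the payload by checking which tone dominates each bit slot."""
    bits = []
    for start in range(0, len(samples) - BIT_SAMPLES + 1, BIT_SAMPLES):
        chunk = samples[start:start + BIT_SAMPLES]
        is_one = tone_power(chunk, FREQ_ONE) > tone_power(chunk, FREQ_ZERO)
        bits.append(1 if is_one else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):  # pack complete groups of 8 bits
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# A hypothetical 4-byte "control input" packet sent as audio and read back.
packet = bytes([0x01, 0x7F, 0xFF, 0x00])
received = decode(encode(packet))
```

Unlike speech, there is nothing to interpret on the receiving end: the bytes that come out are the bytes that went in. A real channel would also need framing, synchronization, and some tolerance for noise, which this sketch ignores.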