Yes its simply missing in the WebRTC spec. webRTC defines end-to-end encryption between two peers. But if you want to transmit data to many peers you need a server which is doing the fanout so the encryption is client1<->server and server<->client2.
This is true for all WebRTC implementations/services. They all state having end-to-end encryption but dont tell you that it means something different in WebRTC contexts.
As I understand, Discord server doesn't need to do audio processing, all mixing is done on the clients. So it would benefit from the "one-to-many" encryption, because currently it has to decrypt from one p2p connection and to encrypt to several p2p connections when someone talks (which breaks the end-to-end).
This is true for all WebRTC implementations/services. They all state having end-to-end encryption but dont tell you that it means something different in WebRTC contexts.
PERC will solve this one day, but its sadly just a draft: https://webrtcglossary.com/perc/