Multiparty conferencing without a media relay is difficult: in a full mesh topology, total bandwidth scales quadratically with the number of participants, and each participant's upload grows linearly with it.
You can also have the participant with the beefiest connection act as such a relay, but if every participant is on a metered or upload-limited connection, that's not an option.
It's pretty sad actually: we seem to be caught in a perpetual catch-22 of "nobody needs public IP reachability and symmetric bandwidth for home connections, since everyone uses beefy cloud servers anyway" and "we need beefy cloud servers because home connections are asymmetric and NATs are horrible"...
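To put rough numbers on the mesh scaling mentioned above, here is a minimal sketch that counts streams for a full mesh versus a single forwarding relay. It assumes every participant sends exactly one video stream and that the relay only forwards packets without transcoding; the names `Load`, `full_mesh`, `forwarding_relay` and the 1.5 Mbit/s per-stream bitrate are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    upload_streams: int    # streams each participant sends
    download_streams: int  # streams each participant receives
    total_streams: int     # streams in flight across the whole call


def full_mesh(n: int) -> Load:
    """Every participant sends its stream directly to every other participant."""
    if n < 2:
        raise ValueError("a call needs at least two participants")
    return Load(n - 1, n - 1, n * (n - 1))


def forwarding_relay(n: int) -> Load:
    """Every participant sends one copy to a relay, which fans it out to the rest."""
    if n < 2:
        raise ValueError("a call needs at least two participants")
    # n uplinks into the relay plus n * (n - 1) downlinks out of it
    return Load(1, n - 1, n + n * (n - 1))


if __name__ == "__main__":
    BITRATE_MBPS = 1.5  # assumed per-stream video bitrate, purely illustrative
    for n in (2, 4, 8, 16):
        mesh, relay = full_mesh(n), forwarding_relay(n)
        print(
            f"n={n:2d}  mesh upload/peer: {mesh.upload_streams * BITRATE_MBPS:5.1f} Mbit/s"
            f"  relay upload/peer: {relay.upload_streams * BITRATE_MBPS:4.1f} Mbit/s"
            f"  mesh total streams: {mesh.total_streams}"
        )
```

Note that the relay does not reduce what each participant downloads. What it takes off the clients is the upload side, which is usually the scarce direction on a home connection.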
CPU usage on each client is also an issue. Architecturally, WebRTC connections are always peer-to-peer, in the sense that each transport carrying media is negotiated individually and does its own bandwidth shaping.
This has some really nice properties. Doing the bandwidth shaping individually for each transport maximizes video quality for each track, but it also means that in an N-way call you have to encode your outgoing video N-1 times. Even with a perfect network connection, you run out of CPU to do the encoding at some point.
Today, on a pretty new-ish laptop the limit for how many outgoing videos you can encode (<waves hands about codecs and settings>) is ~10. On an older Android phone that limit is ~1.
It's possible to imagine changing how WebRTC works so that separate transports can reuse encoded streams. And hopefully that will become possible at some point (https://www.w3.org/TR/webrtc-nv-use-cases/). But not anytime soon. :-(
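As a back-of-the-envelope sketch of the encoding limit described above: if each client has to encode its video N-1 times, the weakest device caps the size of a mesh call. The function names are hypothetical, and the budgets of 10 and 1 are just the hand-wavy figures from this comment, not measurements.

```python
from typing import Sequence


def encodes_needed(participants: int) -> int:
    """In an N-way mesh call, each client encodes its outgoing video N-1 times."""
    return max(participants - 1, 0)


def max_mesh_call_size(encode_budget: int) -> int:
    """Largest N a single client can join, given how many encodes it can sustain."""
    return encode_budget + 1


def mesh_call_feasible(encode_budgets: Sequence[int]) -> bool:
    """True if every participant can afford the N-1 encodes the call requires."""
    needed = encodes_needed(len(encode_budgets))
    return all(budget >= needed for budget in encode_budgets)


if __name__ == "__main__":
    LAPTOP, OLD_PHONE = 10, 1  # rough per-device encode budgets from the text
    print("laptop can join a call of up to", max_mesh_call_size(LAPTOP))
    print("old phone can join a call of up to", max_mesh_call_size(OLD_PHONE))
    print("two laptops + old phone feasible:",
          mesh_call_feasible([LAPTOP, LAPTOP, OLD_PHONE]))
```

Under these assumptions a single older phone already rules out a three-way mesh call, however fast the network is.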
* https://galene.org
* https://github.com/pion/ion
* https://github.com/peer-calls/peer-calls
* https://github.com/meetecho/janus-gateway
* https://github.com/versatica/mediasoup
* https://github.com/jitsi/jitsi-meet
* https://github.com/fox-one/mornin.fm