by kevincox 1368 days ago
> WebRTC is encrypted generally even if you leak metadata like IP address.

Yes, WebRTC does end-to-end encryption by default. The IP is "leaked" because the peers connect directly to one another, so each of them naturally needs the other's IP address.
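
To make the "leak" concrete: the addresses travel inside ICE candidate lines exchanged during negotiation. Here is a minimal sketch of pulling the address out of one, assuming the standard candidate attribute layout (foundation, component, transport, priority, address, port, `typ`, type). The `IceCandidate` class, the `parse_candidate` helper, and the sample line are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class IceCandidate:
    foundation: str
    component: int
    transport: str
    priority: int
    address: str
    port: int
    cand_type: str  # "host", "srflx", "prflx" or "relay"


def parse_candidate(line: str) -> IceCandidate:
    """Parse the leading fixed fields of an ICE candidate attribute."""
    # Accept both the SDP form ("a=candidate:...") and the bare form.
    if line.startswith("a="):
        line = line[2:]
    if not line.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")
    fields = line[len("candidate:"):].split()
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError("malformed ICE candidate line")
    return IceCandidate(
        foundation=fields[0],
        component=int(fields[1]),
        transport=fields[2].lower(),
        priority=int(fields[3]),
        address=fields[4],
        port=int(fields[5]),
        cand_type=fields[7],
    )


# Hypothetical sample; 203.0.113.7 is a documentation-only address.
example = (
    "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 "
    "typ srflx raddr 10.0.0.5 rport 46154"
)
cand = parse_candidate(example)
print(cand.address, cand.cand_type)
```

The `address` field is what the other side ends up seeing. With a `relay` candidate it would be the relay's address rather than your own.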

There are both upsides and downsides to direct P2P connections.

1. Pro: The minimal number of parties can analyze the call.

2. Pro: The call depends on a minimal number of parties.

3. Pro: The call is generally more performant, limited only by the connection between both peers.

4. Pro: No need for third-party services other than a network connection.

5. Con: The peer learns your IP, which may be used to help identify you or DoS your internet connection.

6. Con: Intermediaries anywhere on the network can see which two peers are talking. (With an SFU, only the SFU knows the ends of the connection for sure.)

> Is Cloudflare stating they will be the middleman and therefore have access to the decrypted video stream?

I see nothing in this article that suggests that they will have access to the decrypted video. However, I wouldn't be surprised if that is added in the future.

The reason is that in order to do big calls you need to support multi-quality streams. This can in theory be done on encrypted connections, but not all browsers support this right now (notably Firefox). So if you want the widest support you need to do video transcoding at the SFU.

There are also other features such as recording and live-streaming that (generally) require access to the raw video. (Of course this can be done by adding the recorder/streamer as a "peer" to the E2EE call when needed, but that is still giving the keys to the company at this point.)

3 comments

Regarding performance: we've been collecting (anonymized) data from real-world WebRTC calls for several years, and sadly it's no longer true that p2p routes are generally more performant.

It definitely used to be true that most p2p routes were lower latency than bouncing through a server at, say, an AWS data center. In 2019 we looked closely at this and it was fairly rare to see cases where latency was improved by switching over from a p2p connection to an SFU (media server) connection. Now, the reverse is true. It's usually the case that routing through a media server at AWS (or any other major provider) is as good or better than a p2p route between any two end users.

Early in the pandemic, we assumed this was a temporary thing. ISPs had not built out their networks expecting much upstream traffic. But they'd adjust.

Well, ISPs have evolved. Now we see much better performance in general than we did early in the pandemic. But we still see better performance to "the backbone" than we do between ISPs.
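
A rough way to picture the comparison: a relayed path costs roughly the sum of each user's round trip to the media server, and it wins whenever that sum is no worse than the direct route. The sketch below assumes exactly that and ignores server processing time. The function names and the numbers are invented for illustration and are not taken from the data mentioned above.

```python
def relayed_rtt_ms(rtt_a_to_server_ms: float, rtt_b_to_server_ms: float) -> float:
    """Approximate the relayed round trip as the sum of both legs."""
    return rtt_a_to_server_ms + rtt_b_to_server_ms


def better_route(
    p2p_rtt_ms: float,
    rtt_a_to_server_ms: float,
    rtt_b_to_server_ms: float,
) -> str:
    """Return "sfu" if the relayed path is as good or better, else "p2p"."""
    relayed = relayed_rtt_ms(rtt_a_to_server_ms, rtt_b_to_server_ms)
    return "sfu" if relayed <= p2p_rtt_ms else "p2p"


# Made-up case: slow peering between two ISPs, fast paths to the backbone.
print(better_route(p2p_rtt_ms=95, rtt_a_to_server_ms=30, rtt_b_to_server_ms=40))

# Made-up case: both users are close to each other, server is a detour.
print(better_route(p2p_rtt_ms=40, rtt_a_to_server_ms=30, rtt_b_to_server_ms=40))
```

The first call picks the server route and the second picks the direct one, which is the whole trade-off in miniature.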

Another step in the Internet becoming less of a decentralized network, perhaps.

WebRTC is end-to-end encrypted to the peer. So you're right, when you do actual peer to peer WebRTC between you and another user in a browser, you have end-to-end encrypted communication. When you go through a server, it's just another peer. So the word end maybe doesn't fit anymore, because it's a server that is the peer and they can decrypt the stream. Transcoding is pretty common at that stage because it's helpful for scaling.

That isn't necessarily true. I guess it is a bit opaque but when you negotiate a WebRTC connection you get a key and a list of network endpoints that you can use. It is entirely possible to add a proxy server in that list of endpoints without giving the proxy server a key as far as I am aware.

That being said, for big calls you start wanting to do selective forwarding, and you probably need to drop down to a lower layer in the WebRTC stack to manage this and to let the Selective Forwarding Unit (SFU) drop chunks without messing up the connection. However, it is definitely possible to do all of this over WebRTC with full E2E encryption (see Jitsi Meet).
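
As a sketch of what the forwarding decision can look like: assume the sender publishes a few quality layers and the forwarder knows only each layer's id and bitrate plus a bandwidth estimate per receiver. It can then pick what to forward without touching the media itself. `Layer`, `pick_layer`, and all the numbers are hypothetical; real SFUs are considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    rid: str           # layer identifier chosen by the sender
    bitrate_kbps: int  # approximate bitrate of this layer


def pick_layer(layers: List[Layer], receiver_budget_kbps: int) -> Optional[Layer]:
    """Return the highest-bitrate layer that fits the receiver's budget.

    Returns None when no layer fits, meaning no video is forwarded.
    """
    fitting = [layer for layer in layers if layer.bitrate_kbps <= receiver_budget_kbps]
    if not fitting:
        return None
    return max(fitting, key=lambda layer: layer.bitrate_kbps)


# Hypothetical three-layer stream: quarter, half and full resolution.
layers = [Layer("q", 150), Layer("h", 500), Layer("f", 1500)]

for budget in (100, 800, 5000):
    chosen = pick_layer(layers, budget)
    print(budget, chosen.rid if chosen else None)
```

Nothing here needs the decoded video, which is why this kind of forwarding can coexist with E2E encryption as long as the layer metadata is visible to the forwarder.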

With today's browser implementations of WebRTC, you can proxy through a TURN server while still maintaining end-to-end encryption, but you can't proxy through any kind of customized endpoint/server, because each endpoint necessarily has the encryption keys as part of the session negotiation.

Chrome implements experimental user-space media stream processing APIs that allow you to build "end-to-end encryption" at the JavaScript level. But, to me at least, it's a bit hand-wavy to call that "end-to-end encryption" because the keys are created, managed, and accessible from user-space. And neither Safari nor Firefox yet supports these APIs.

There's ongoing work on this: https://datatracker.ietf.org/wg/perc/documents/

Regarding Pro #4: Wouldn't you still need a signaling server to establish that P2P connection and handle network switches and reconnections and such?

That's a good point. You do need something to do the negotiation. However, this is not an intensive task, and there are a handful of approaches that can avoid needing a dedicated third party.

1. IPFS PubSub can be used for sharing this info (although you do still need to bootstrap the IPFS DHT).

2. You can share blobs over text chat. (Including services like Jami which are distributed)
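
For the "blobs over text chat" option, here is a minimal sketch of packing a session description into a paste-friendly string and back. It assumes the description is a `{type, sdp}` mapping like the one browsers produce. `encode_blob`, `decode_blob`, and the truncated SDP are placeholders for illustration.

```python
import base64
import json
import zlib


def encode_blob(description: dict) -> str:
    """Serialize, compress and base64-encode a description for pasting."""
    raw = json.dumps(description, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")


def decode_blob(blob: str) -> dict:
    """Reverse encode_blob: base64-decode, decompress and parse."""
    raw = zlib.decompress(base64.urlsafe_b64decode(blob.encode("ascii")))
    return json.loads(raw.decode("utf-8"))


# Placeholder offer; a real SDP is much longer and carries the ICE
# candidates and the DTLS fingerprint.
offer = {"type": "offer", "sdp": "v=0\r\no=- 0 0 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\n"}

blob = encode_blob(offer)
print(blob)
print(decode_blob(blob) == offer)
```

One side pastes the offer blob into the chat, the other side replies with an answer blob built the same way, and no dedicated signaling server is involved. Whoever can read the chat can read the blob, so the chat channel itself needs to be trusted.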