Hacker News
by Xeoncross 1362 days ago
> "With a traditional WebRTC implementation, both the patient and therapist’s devices would talk directly with each other, leading to exposure of potentially sensitive data such as the IP address... When using Calls, you are still using WebRTC, but the individual participants are connecting to the Cloudflare network. If four people are on a video call powered by Cloudflare Calls, each of the four participants' devices will be talking only with the Cloudflare network. To your end users, the experience will feel just like a peer-to-peer call, only with added security and privacy upside."

Can someone clarify this? WebRTC is encrypted generally even if you leak metadata like IP address. Is Cloudflare stating they will be the middleman and therefore have access to the decrypted video stream?

6 comments

As far as I understood it, the added security rests on the fact that the other WebRTC peers only see Cloudflare's IP instead of your own. Also, nobody except Cloudflare knows exactly who you are talking to. I would still expect the media channels themselves to remain encrypted even when multiplexed by Cloudflare's network.

Edit: yes, it's encrypted:

> Finally, all video and audio traffic that passes through Cloudflare Calls is encrypted by default. Calls leverages existing Cloudflare products including Argo to route the video and audio content in a secure and efficient manner.

It doesn't say that Cloudflare can't or doesn't access the encrypted data. It seems to be written so that everyone would assume they can't, but AFAICT it never says so explicitly, which makes me think they phrased it this way for a reason. I could definitely be wrong, though.
> WebRTC is encrypted generally even if you leak metadata like IP address.

Yes, WebRTC does end-to-end encryption by default. The IP is "leaked" because the peers connect directly to one another, so each naturally needs the other's IP address in order to talk to it.
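
To make the "leak" concrete: during negotiation each side sends ICE candidate lines that carry its reachable addresses. Here is a minimal TypeScript sketch that pulls the address out of one such line, assuming the standard candidate attribute layout. The function name and the sample values are made up for illustration (203.0.113.0/24 is a documentation range).

```typescript
// What a remote peer can read from a single ICE candidate line.
interface CandidateInfo {
  address: string;
  port: number;
  candidateType: string; // "host", "srflx", "prflx" or "relay"
}

function parseIceCandidate(line: string): CandidateInfo | null {
  // Layout: candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
  const fields = line.trim().replace(/^a=/, "").split(/\s+/);
  const address = fields[4];
  const port = fields[5];
  const typKeyword = fields[6];
  const candidateType = fields[7];
  if (
    address === undefined ||
    port === undefined ||
    candidateType === undefined ||
    typKeyword !== "typ"
  ) {
    return null; // not a candidate line in the expected layout
  }
  return { address, port: Number(port), candidateType };
}

// Made-up server-reflexive candidate: the public address as seen by a STUN server.
const exampleCandidate =
  "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 0.0.0.0 rport 0";
const exposed = parseIceCandidate(exampleCandidate);
```

When everything goes through a relay or an SFU, the address in that position is the server's rather than the other participant's.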

There are both upsides and downsides to direct P2P connections.

1. Pro: The minimal number of parties can analyze the call.

2. Pro: The call depends on a minimal number of parties.

3. Pro: The call is generally more performant, limited only by the connection between both peers.

4. Pro: No need for third-party services other than a network connection.

5. Con: The peer learns your IP which may be used to help identify you or DoS your internet connection.

6. Con: Intermediaries anywhere on the network can see which two peers are talking. (With an SFU, only the SFU knows the ends of the connection for sure.)

> Is Cloudflare stating they will be the middleman and therefore have access to the decrypted video stream?

I see nothing in this article that suggests that they will have access to the decrypted video. However I wouldn't be surprised if that is added in the future.

The reason is that in order to do big calls you need to support multi-quality streams. This can in theory be done on encrypted connections, but not all browsers support this right now (notably Firefox). So if you want the widest support you need to do video transcoding at the SFU.

There are also other features such as recording and live-streaming that (generally) require access to the raw video. (Of course this can be done by adding the recorder/streamer as a "peer" to the E2EE call when needed, but that is still giving the keys to the company at that point.)

Regarding performance: we've been collecting (anonymized) data from real-world WebRTC calls for several years, and sadly it's no longer true that p2p routes are generally more performant.

It definitely used to be true that most p2p routes were lower latency than bouncing through a server at, say, an AWS data center. In 2019 we looked closely at this and it was fairly rare to see cases where latency was improved by switching over from a p2p connection to an SFU (media server) connection. Now, the reverse is true. It's usually the case that routing through a media server at AWS (or any other major provider) is as good as or better than a p2p route between any two end users.

Early in the pandemic, we assumed this was a temporary thing. ISPs had not built out their networks expecting much upstream traffic. But they'd adjust.

Well, ISPs have evolved. Now we see much better performance in general than we did early in the pandemic. But we still see better performance to "the backbone" than we do between ISPs.

Another step in the Internet becoming less of a decentralized network, perhaps.

WebRTC is end-to-end encrypted to the peer. So you're right, when you do actual peer to peer WebRTC between you and another user in a browser, you have end-to-end encrypted communication. When you go through a server, it's just another peer. So the word end maybe doesn't fit anymore, because it's a server that is the peer and they can decrypt the stream. Transcoding is pretty common at that stage because it's helpful for scaling.
That isn't necessarily true. I guess it is a bit opaque but when you negotiate a WebRTC connection you get a key and a list of network endpoints that you can use. It is entirely possible to add a proxy server in that list of endpoints without giving the proxy server a key as far as I am aware.

That being said, for big calls you start wanting selective forwarding, and you probably need to drop down to a lower layer of the WebRTC stack so that the Selective Forwarding Unit (SFU) can drop chunks without messing up the connection. However, it is definitely possible to do all of this over WebRTC with full E2E encryption (see Jitsi Meet).
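
The forwarding decision itself does not need the decoded video. Below is a rough TypeScript sketch of it, assuming the sender publishes several qualities of the same stream and the SFU keeps a bandwidth estimate per receiver. The layer names, bitrates and function name are invented for illustration and are not taken from any particular SFU.

```typescript
interface SimulcastLayer {
  rid: string; // label for one quality of the sender's stream
  bitrateKbps: number;
}

// Pick the highest-bitrate layer that fits the receiver's estimated bandwidth,
// falling back to the lowest layer when nothing fits.
function pickLayer(layers: SimulcastLayer[], availableKbps: number): SimulcastLayer | null {
  if (layers.length === 0) return null;
  const ascending = [...layers].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  let chosen: SimulcastLayer | null = ascending[0] ?? null;
  for (const layer of ascending) {
    if (layer.bitrateKbps <= availableKbps) chosen = layer;
  }
  return chosen;
}

// Hypothetical layers published by one sender.
const offered: SimulcastLayer[] = [
  { rid: "q", bitrateKbps: 150 },
  { rid: "h", bitrateKbps: 500 },
  { rid: "f", bitrateKbps: 1500 },
];
const forMobileViewer = pickLayer(offered, 800); // the 500 kbps layer fits, the 1500 kbps one does not
```

The SFU then forwards only the packets of the chosen layer to that receiver and drops the rest.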

With today's browser implementations of WebRTC, you can proxy through a TURN server while still maintaining end-to-end encryption, but you can't proxy through any kind of customized endpoint/server, because each endpoint necessarily has the encryption keys as part of the session negotiation.
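
For the TURN case, the browser-side setup is just configuration. This is a small TypeScript sketch, assuming a TURN server you control. The URL and credentials are placeholders, and the interface is declared locally to mirror the fields of the browser's `RTCConfiguration` so the snippet stands on its own.

```typescript
// Local mirror of the RTCConfiguration fields used here.
interface RelayOnlyConfig {
  iceServers: { urls: string; username: string; credential: string }[];
  iceTransportPolicy: "relay";
}

function buildRelayOnlyConfig(
  turnUrl: string,
  username: string,
  credential: string
): RelayOnlyConfig {
  return {
    iceServers: [{ urls: turnUrl, username, credential }],
    // "relay" restricts ICE to relayed (TURN) candidates, so the remote peer
    // is only offered the relay's address, not a host or reflexive one.
    iceTransportPolicy: "relay",
  };
}

// Placeholder server and credentials.
const relayConfig = buildRelayOnlyConfig("turn:turn.example.org:3478", "alice", "s3cret");
// In a browser: const pc = new RTCPeerConnection(relayConfig);
```

The TURN server only shuttles the already-encrypted packets. It takes no part in the key exchange between the two peers.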

Chrome implements experimental user-space media stream processing APIs that allow you to build "end-to-end encryption" at the JavaScript level. But, to me at least, it's a bit hand-wavy to call that "end-to-end encryption" because the keys are created, managed, and accessible from user-space. And neither Safari nor Firefox supports these APIs yet.
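
To show what "at the JavaScript level" means, here is a deliberately toy TypeScript sketch of a per-frame transform. The XOR "cipher" is NOT secure and only stands in for a real AEAD such as AES-GCM. The function name, the key and the number of header bytes left in the clear are all assumptions for illustration, and the wiring into the browser's encoded-frame streams is omitted.

```typescript
// TOY transform: XOR the payload with a repeating key. Not real encryption.
function transformFrame(
  frame: Uint8Array,
  key: Uint8Array,
  clearHeaderBytes: number
): Uint8Array {
  if (key.length === 0) throw new Error("key must not be empty");
  const out = new Uint8Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    const byte = frame[i] ?? 0;
    if (i < clearHeaderBytes) {
      out[i] = byte; // assumed: leading bytes stay readable for intermediaries
    } else {
      out[i] = byte ^ (key[(i - clearHeaderBytes) % key.length] ?? 0);
    }
  }
  return out;
}

// The key is an ordinary JavaScript value, readable by any script on the page.
const pageKey = new Uint8Array([0x13, 0x37, 0xc0, 0xde]);
const plainFrame = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]);
const sealedFrame = transformFrame(plainFrame, pageKey, 2);
const openedFrame = transformFrame(sealedFrame, pageKey, 2); // XOR is its own inverse
```

The part that matters for the argument is `pageKey`: whoever serves the page's JavaScript can read or replace it.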

There's ongoing work on this: https://datatracker.ietf.org/wg/perc/documents/

Regarding Pro #4: Wouldn't you still need a signaling server to establish that P2P connection and handle network switches and reconnections and such?
That's a good point. You do need something to do the negotiation. However this is not an intensive task and there are a handful of approaches that can avoid needing a dedicated third-party.

1. IPFS PubSub can be used for sharing this info (although you do still need to bootstrap the IPFS DHT).

2. You can share blobs over text chat. (Including services like Jami which are distributed)
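
Option 2 can be sketched in a few lines of TypeScript. The "blob" is just the session description serialized as text, assuming the `{type, sdp}` shape the browser uses for offers and answers. The interface and function names here are made up for illustration.

```typescript
// Mirrors the {type, sdp} shape of the browser's session description objects.
interface SignalBlob {
  type: "offer" | "answer";
  sdp: string;
}

// Turn a local description into text that can be pasted into any chat.
function encodeSignal(description: SignalBlob): string {
  return JSON.stringify({ type: description.type, sdp: description.sdp });
}

// Validate pasted text before handing it to the peer connection.
function decodeSignal(pasted: string): SignalBlob {
  const parsed: unknown = JSON.parse(pasted);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("not a signal blob");
  }
  // The cast only names the fields; the checks below do the real validation.
  const fields = parsed as { type?: string; sdp?: string };
  const kind = fields.type;
  const sdp = fields.sdp;
  if (kind !== "offer" && kind !== "answer") throw new Error("unexpected type");
  if (typeof sdp !== "string") throw new Error("missing sdp");
  return { type: kind, sdp };
}

// Made-up, truncated SDP just to exercise the round trip.
const pastedText = encodeSignal({ type: "offer", sdp: "v=0\r\no=- 0 0 IN IP4 127.0.0.1\r\n" });
const received = decodeSignal(pastedText);
```

In a browser the receiving side would pass `received` to `pc.setRemoteDescription(...)`. For a one-shot paste like this, you would normally wait for ICE gathering to complete first so that the candidates are already inside the SDP.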

Instead of "patient and therapist" a better example might be "livestreamer and griefer"

A traditional form of griefing is "get your victim's IP address from a direct-connecting service like Skype and DDoS them while they're doing something latency-sensitive".

"If you don't trust each other, trust us": a very understandable value proposition. Also very understandable trade-offs.
Personally, I wouldn't want even more people involved in my medical communications. Imagine if this data were leaked or sold and then used against you, for instance after a consultation about an abortion or a drug habit.
Cloudflare has to have access to the decrypted video just like Google Meet does, because browsers by default encrypt to the peer and they are the peer. After all these years there still isn't universal browser support for actual E2EE. What support does exist uses hacky APIs.
> the patient and therapist’s devices would talk directly with each other

For health data that's what you want. So I'd like to see how it would go with the GDPR agencies if used in a medical app.