by zahllos 2277 days ago

Yes.

Firstly, classical VoIP, including the variants that carry video, has two stages. The first is a signalling protocol, the Session Initiation Protocol (SIP); the second is the Real-time Transport Protocol (RTP), which carries the media. SIP is used so that clients can find each other over the internet. If they can punch through their firewalls, great. If not, they might need help from STUN servers. Once that is done, the clients talk to each other directly using RTP, not via the central server.
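To make the STUN step a bit more concrete, here is a minimal sketch of the request a client sends to a STUN server to learn its public address. It builds only the 20-byte header of an attribute-less Binding Request as laid out in RFC 5389; the function name is my own, and actually sending the packet and parsing the response are left out.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id=None):
    """Return the 20-byte header of a STUN Binding Request with no attributes.

    Hypothetical helper for illustration; not taken from any client's code.
    """
    if transaction_id is None:
        transaction_id = os.urandom(12)
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # message type (16 bits), body length (16 bits, zero here), magic cookie (32 bits)
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id
```

The server's reply carries the address and port it saw the request come from, which is what the client then offers to its peer during signalling.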

In the SRTP variant (as used by WebRTC), this involves agreeing keys with DTLS, i.e. TLS over datagrams, run directly between the peers.

Wire does calling, except the SIP part is replaced by the existing messaging system they already have: the double ratchet. So if you want to call, that's just another end-to-end encrypted, state-of-the-art message exchange to signal that fact and agree a key.

The original RFC for SRTP isn't great and many of those options are still supported in implementations. Here's the RFC for SRTP: https://tools.ietf.org/html/rfc3711#section-4 . This document would likely result in tptacek banging his head so much on his keyboard that he manages to write a perl script capable of mind control that runs wild, turns on its creator and turns him into a PGP and DNSSEC loving zealot (presumably the motivation would be spite for having been written in perl). Or with a little less hyperbole, the crypto is a product of the times in which that spec was written, and we'd do better now.

I mean... NULL cipher? We have that option already. It's just plain RTP without the S.
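For a rough idea of what that option amounts to, here is a simplified sketch, assuming the RFC 3711 default authentication transform (HMAC-SHA1 truncated to 80 bits). The function names are mine, and key derivation, replay protection and the rest of the packet processing are left out. As far as I read the RFC, the NULL cipher can still be paired with the authentication tag, so you would keep integrity but lose all confidentiality.

```python
import hashlib
import hmac
import struct


def null_cipher(payload: bytes) -> bytes:
    """The NULL cipher: the 'encrypted' payload is the plaintext itself."""
    return payload


def srtp_auth_tag(auth_key: bytes, rtp_header: bytes, enc_payload: bytes,
                  roc: int, tag_len: int = 10) -> bytes:
    """HMAC-SHA1 over header || payload || rollover counter, truncated.

    Illustrative only: auth_key would normally come out of the SRTP key
    derivation, which is not modelled here.
    """
    message = rtp_header + enc_payload + struct.pack("!I", roc)
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:tag_len]


header = bytes(12)  # placeholder 12-byte RTP header
payload = null_cipher(b"hello, eavesdropper")
tag = srtp_auth_tag(b"k" * 20, header, payload, roc=0)
```

Anyone on the path can read `payload` as-is; the tag only tells the receiver that nobody modified it.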

According to this survey of SRTP security: https://tools.ietf.org/html/rfc7201 , some sane options (AES-GCM and to a certain extent AES-CCM, but AES-SIV would have been better) are available, defined in RFC 7714. I have found this post from 2017 indicating that GCM support exists in the WebRTC library: https://groups.google.com/forum/#!topic/discuss-webrtc/fz3kh...

I can't see any evidence in the codebase right now that there's any attempt to do anything custom with SRTP for Wire (SRTP used because of WebRTC), so, I assume that they're simply using whatever is available by default for video/voice as provided by Electron and/or your browser.

SRTP also leaks metadata all over the show: https://webrtc-security.github.io/#4.3.4.
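One example of that, as I understand it: SRTP encrypts the payload but leaves the RTP header in the clear, so anyone who can see the packets can read it. The sketch below parses the fixed 12-byte RTP header (RFC 3550 layout); the class and function names are my own, and header extensions and CSRC lists are ignored.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Read the fixed RTP header fields from the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,          # top two bits of the first byte
        payload_type=b1 & 0x7F,   # low seven bits (top bit is the marker)
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

Payload type, timing and the stream identifier are enough for an observer to tell who is sending what kind of media and when, even without touching the encrypted payload.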

I can't find any evidence existing audits have looked into the security of the voice/video protocol. They appear mostly to have focused on: a) the X3DH/Signal Protocol implementation and b) a high level audit of the whole application.

So far we're just talking point-to-point. Things get fun when you want group chat: do you a) establish any-to-any multicast (can be encrypted e2e) and let clients composite the video onto their (possibly low-powered phone) screen, or b) provide an RTP mixer (MITM) that needs the raw unencrypted video to combine it and provide a unified stream? Presumably Zoom opted for the latter; as far as I understand it, Wire etc. have opted for the former.
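To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes the simplest form of each design: in the mesh every client sends its stream separately to every other client, and with the mixer every client uploads one stream and gets one composited stream back. Real deployments (simulcast, selective forwarding and so on) will differ, and the function names are just for illustration.

```python
def mesh_streams(n: int) -> dict:
    """Stream counts for an any-to-any call with n participants."""
    if n < 2:
        raise ValueError("a call needs at least two participants")
    return {
        "uploads_per_client": n - 1,
        "downloads_per_client": n - 1,
        "total_streams": n * (n - 1),
    }


def mixer_streams(n: int) -> dict:
    """Stream counts when a central mixer composites the call."""
    if n < 2:
        raise ValueError("a call needs at least two participants")
    return {
        "uploads_per_client": 1,
        "downloads_per_client": 1,
        "total_streams": 2 * n,
    }


for n in (2, 5, 10):
    print(n, mesh_streams(n)["total_streams"], mixer_streams(n)["total_streams"])
```

The mesh keeps the server out of the media path, but each phone's upload and decode load grows with the number of participants, which is the cost of staying end-to-end.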

I don't own any Apple devices so it's hard to look into their offerings, but MDG in the article states they've managed it. For text, having copied Signal, Wire have some claim to be the most secure platform (or rather, one of them). However, I suspect that actually everybody in this space sucks really badly and nobody is actually looking into it.

In short, the security of text messages is about as state of the art as you can get. The security of video/voice, for _all_ "secure messengers", likely leaks a bit of metadata at best, and might use some funky 2000s era crypto at worst.

tl;dr they're using exactly the same standards for voice/video as everyone else, no secret sauce.