by doctorpangloss 993 days ago
Why are you using Pion instead of Coturn? Is there a lot of innovation left on the table for TURN relays?

> ML-enabled tools layered on top.

In your opinion, have Encoded Transforms definitively settled the approach to this matter?

If I want to use free PyTorch garbage X on my camera stream Y, I can take a big hit on occupancy and latency, sometimes as high as 8 frames, by reading the decoded video back to the CPU and then immediately copying it back to the GPU. Have you done any work to access the decoded frame in libwebrtc while it is still in GPU memory?
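For scale, here is a back-of-the-envelope sketch of what an 8-frame hit means in wall-clock time. The 30 and 60 fps frame rates and the helper name `frames_to_ms` are assumptions for illustration; the comment does not state a frame rate.

```python
# Convert a latency measured in frames into milliseconds.
# The frame rates used below are illustrative assumptions, not from the thread.

def frames_to_ms(frames: int, fps: float) -> float:
    """Return the delay in milliseconds of `frames` frames at `fps` frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames * 1000.0 / fps


if __name__ == "__main__":
    for fps in (30, 60):
        print(f"8 frames at {fps} fps = {frames_to_ms(8, fps):.1f} ms")
```

Under those assumed frame rates, the round trip through CPU memory costs on the order of a hundred milliseconds or more, which is a large share of a typical interactive latency budget.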

How are you guys going to force Mobile Safari to enable HEVC and VP9 by default in WebRTC? I am of course joking, you cannot do that, but if Apple doesn't want to support X in the browser, and Zoom's janky MPEG-over-datachannel does, and it's on a tiny screen so video quality barely matters, is building out all of Zoom's approach also on your roadmap?


TURN is only one piece of WebRTC, and for the product we're building we want to be in control of our destiny on the AV side. Building on a fairly mature open source implementation of WebRTC was an early choice we made. We've already used this to build an international mesh architecture and a few-to-many theater experience over WebRTC that can support thousands of viewers with very low latency.

The ML work we're doing is outside of video processing for other features of our platform, although we have incorporated some tools for features like background blur.

On mobile we've built native iOS and Android apps. Our goal in general is not to chase Zoom, but to make a very different experience for distributed work. That said, the AV has to work.

> Building on a fairly mature open source implementation of WebRTC was an early choice we made.

Hmm, you would know this best, having dealt with libwebrtc, that there's libwebrtc Chrome, libwebrtc Mobile Safari, and stuff that doesn't work. I'd say more, but if you say Pion three times, the Pion maintainer comes out in the comments and casts a spell on you and your jitter buffers become 10,000ms long.

> used this to build an international mesh architecture

AFAIK, Twilio, Amazon, Azure and Steam operate the only at-scale private-routing-over-TURN (aka network-traversal) services for others to use, and Twilio and Amazon are the same network. Twilio doesn't even bother with Global Accelerators (aka anycast IPs); they route you via DNS responses, and I doubt they've updated the code for years. Do you guys have your own private network? Surely it's "Amazon."

I suppose if you did that work, well, you probably don't need 90% of the WebRTC feature set anyway, so you might as well centralize it, which is what all the big chat vendors end up doing.

Anyway, it is extremely hard to innovate in this space; it's a lot of mashing together open source libraries and doing IT drudgery. Sometimes that IT drudgery is gluing C++ code together; sometimes it's deploying onto 15 AWS regions. I really appreciate the complexity of what you're doing and it all looks very cool.