by robfig
993 days ago
We developed this for Roam (https://ro.am/ [1]). Roam is a virtual office environment for real-time collaboration: audio, video, whiteboards, personal offices, team rooms, theaters and group chat, with ML-enabled tools layered on top. It's based on Chromium & WebRTC so that we can ship a cross-platform app (Electron) as well as a near-parity web client with a tiny team. We've had a good experience with this approach, although it is far from plug-and-play. We do have to identify and patch issues in Chromium/WebRTC and in its interaction with the server (Pion) to get our video quality up to compete with Zoom/Teams/Meet. We are able to effectively compare video quality across these providers and expect to reach their video quality with this stack.
Disclaimer: I work on Roam's Chat, AI, and API.
[1] We are currently in closed beta so there's not much there at the moment.
> ML-enabled tools layered on top.
In your opinion, have Encoded Transforms definitively settled the approach to this matter?
If I want to use free PyTorch garbage X on my camera stream Y, I can take a big hit on occupancy and latency, sometimes as high as 8 frames, by reading the decoded video back to the CPU and then immediately copying it to the GPU again. Have you done any work to access the decoded frame in libwebrtc while it is still in GPU memory?
How are you guys going to force Mobile Safari to enable HEVC and VP9 by default in WebRTC? I am of course joking, you cannot do that, but if Apple doesn't want to support X in the browser, and Zoom's janky MPEG-over-datachannel does, and it's on a tiny screen so video quality barely matters, is also building all of Zoom's approach on your roadmap?