If you have money (and they did), it’s actually not too difficult to solve. There are services out there that will scale both video streaming[1] and websockets[2] for you.
Ultra-low latency that allows real-time interaction is really hard to scale. I have built an HQ clone, and the only approach that seemed right in terms of latency was WebRTC.