Let me rephrase the question. Assuming that in the video the red background is the network capacity and the lines are the video bitrate used, Salsify seems to be using ~6000 kbps and WebRTC ~2500 kbps. Is this higher bitrate because you're using VP8, or is it a limitation of the protocol? If it's because of VP8, how hard would it be to adapt to modern codecs like HEVC and AV1?
I don't see anything in what they did that won't work with HEVC or AV1. They just hacked VP8 to be able to save and restore codec state per frame so they can generate multiple versions of the next frame, choosing the smaller one when network conditions are bad. Their innovation is in preventing congestion rather than reacting to the aftermath of congestion.
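To make that concrete, here is a rough toy sketch in Python of the idea as I understand it. It is not Salsify's code: `ToyEncoder`, `encode_next_frame`, the quality percentages, and the byte budget are all made-up names and numbers for illustration, and the "compression" is just truncation. The fallback to the smallest version when nothing fits is also my own assumption.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyEncoder:
    """Hypothetical stand-in for a codec whose internal state can be copied."""
    # Stands in for real codec state (reference frames, probability tables, ...).
    frames_encoded: int = 0

    def encode(self, frame: bytes, quality_pct: int) -> bytes:
        # Toy "compression": keep a quality-proportional prefix of the frame.
        self.frames_encoded += 1
        return frame[: max(1, len(frame) * quality_pct // 100)]


def encode_next_frame(encoder, frame, budget_bytes, qualities=(90, 40)):
    """Encode several versions of one frame from the same saved state.

    Returns (new_encoder_state, payload) for the largest version that fits
    in budget_bytes, or the smallest version if none fits (an assumption).
    """
    candidates = []
    for q in qualities:
        # "Restore" the saved state: every version starts from an identical copy.
        trial = copy.deepcopy(encoder)
        candidates.append((trial, trial.encode(frame, q)))

    fitting = [c for c in candidates if len(c[1]) <= budget_bytes]
    if fitting:
        return max(fitting, key=lambda c: len(c[1]))
    return min(candidates, key=lambda c: len(c[1]))


if __name__ == "__main__":
    enc = ToyEncoder()
    frame = bytes(1000)
    for budget in (1000, 500):
        _, payload = encode_next_frame(enc, frame, budget)
        print(budget, len(payload))
```

The point is only that the size decision is made per frame, with the encoder committing to whichever version it actually sends. Any codec that can snapshot its state this way could in principle be plugged in.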
Right, they may have used VP8 because VP9 and HEVC are more CPU-intensive, and with multiple versions of each frame they could technically be encoding more frames per second than the 60 FPS input.
Some is probably due to the codec (VP8 vs. H.264), but they are not wildly different in efficiency. WebRTC doesn't use VP9 or H.265 yet. I view the higher bandwidth used as an attempt to maximize quality at every bandwidth; they just didn't set a ceiling on the encoded frame size. It pushes the total quality number higher on the chart they've published.
As a practical matter it isn't very, well, practical to have to change every codec to support this. Maybe in the next standards, but that's going to be hard.