by mgamache 2973 days ago
To me, the key innovation here is the tight integration between network conditions and codec frame size. Standard codecs are set up with specific bandwidth requirements, and they produce encoded frames that 'average' around that size. You could just re-initialize a codec at a lower bandwidth on the fly, but you would have to send an I frame (a large full frame) to kick off the new series of frames (as most video frames are just updates of a previous frame). Having a codec accept a bandwidth target per frame is a really good idea.
2 comments

Codecs used by real-time video systems are able to adjust the bitrate on the fly. There's no keyframe request every time that happens unless the resolution changes. How quickly they adjust may vary; software implementations generally do it for the next encoded frame. The frame will still be somewhat larger or smaller than the target size, since codecs can't accurately predict the encoded size for given quality parameters.

The Salsify implementation in the paper has a slightly more accurate way of producing a single frame: it encodes two frames with different quality targets and takes the largest one below the frame size target.
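
A rough sketch of that selection step, not the paper's code: `pick_encoding` is a made-up name, the candidates stand in for the two encoded versions of the frame, and returning `None` when neither fits is an assumption of this sketch.

```python
from typing import Optional, Sequence


def pick_encoding(candidates: Sequence[bytes], target_bytes: int) -> Optional[bytes]:
    """Return the largest candidate that fits in target_bytes, or None.

    Hypothetical helper: `candidates` are encodings of the same frame at
    different quality settings; `target_bytes` is the per-frame size budget.
    """
    # Keep only the encodings that fit within the budget.
    fitting = [c for c in candidates if len(c) <= target_bytes]
    # Prefer the biggest one that fits, i.e. the best quality we can afford.
    return max(fitting, key=len) if fitting else None


low_quality = b"x" * 800     # stand-in for the lower-quality encoding
high_quality = b"x" * 1200   # stand-in for the higher-quality encoding
chosen = pick_encoding([low_quality, high_quality], target_bytes=1000)
```

With a 1000-byte budget only the 800-byte version fits, so that is the one chosen; with a larger budget the higher-quality version would win.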

For a resolution change, couldn't you just scale the old last frame to the new resolution and use that as the basis for more P frames? (originally replied to the wrong comment)
Short answer: Yes.

Longer answer: The codec needs to support it. Codecs actually allow prediction from multiple reference frames, and maintain a buffer of them (2 to 16, depending on the codec, profile, and level). An individual frame may refer to several (potentially all) of those. So re-scaling up to 16 frames for every frame you decode will get quite expensive, not to mention the generational losses of doing this repeatedly for every resolution change. In practice what happens is you scale individual blocks when they get referenced by the current frame. But that has to be integrated into the motion compensation routines of the codec.

Both VP9 and AV1 support this, for example.
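
To make the "scale blocks when they get referenced" idea concrete, here is a toy sketch. It is not how VP9 or AV1 actually do it: real codecs use proper interpolation filters and motion vectors, while this uses nearest-neighbour sampling with no motion, and `fetch_scaled_block` is a hypothetical name.

```python
def fetch_scaled_block(ref, cur_w, cur_h, x, y, size):
    """Build a size x size prediction block for position (x, y) of the
    current frame from a reference frame stored at another resolution.

    `ref` is a list of pixel rows. Only the pixels this block needs are
    sampled, so the whole reference frame is never rescaled.
    """
    ref_h, ref_w = len(ref), len(ref[0])
    block = []
    for dy in range(size):
        # Map the current-frame row to the nearest reference row.
        ry = min((y + dy) * ref_h // cur_h, ref_h - 1)
        row = []
        for dx in range(size):
            # Map the current-frame column to the nearest reference column.
            rx = min((x + dx) * ref_w // cur_w, ref_w - 1)
            row.append(ref[ry][rx])
        block.append(row)
    return block


# 4x4 reference frame, current frame is 8x8 (resolution doubled).
reference = [[r * 4 + c for c in range(4)] for r in range(4)]
block = fetch_scaled_block(reference, cur_w=8, cur_h=8, x=0, y=0, size=4)
```

The cost scales with the number of blocks that are actually referenced rather than with the number of frames in the reference buffer.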

Modern codecs have rolling I frames, which spread the cost over many other frames to avoid the spike in bandwidth needs.
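
A toy schedule showing the idea, assuming the refresh walks over block rows and wraps around after a fixed period. `intra_rows` and its parameters are invented for illustration; a real encoder also has to constrain prediction so that refreshed areas stay clean, which this ignores.

```python
def intra_rows(frame_index: int, total_rows: int, period: int) -> range:
    """Rows of blocks to intra-code in this frame.

    Each frame refreshes one slice of the picture; after `period`
    consecutive frames every row has been refreshed exactly once.
    """
    k = frame_index % period
    # Integer boundaries so the slices tile the picture even when
    # total_rows is not a multiple of period.
    start = (k * total_rows) // period
    end = ((k + 1) * total_rows) // period
    return range(start, end)


# 8 block rows refreshed over a 4-frame period: 2 rows per frame.
schedule = [list(intra_rows(i, total_rows=8, period=4)) for i in range(4)]
```

Instead of one frame paying for a full intra picture, each frame carries a fraction of it, so frame sizes stay closer to the average.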