by spongebobstoes 657 days ago
TLDR: there are a lot of moving pieces, but people are actively working on it. Below I try to summarize some of the challenges.

Bandwidth requirements are a big one. For broadcasts you want your assets to be cacheable in the CDN and on the device. Without custom edge code, custom client code, and custom media packaging, that means traditional URLs, each of which points to a short (e.g. 2s) mp4 segment of the stream.

The container format used is typically mp4, and you cannot write the mp4 metadata without knowing the size of each frame, which you don't know until encoding finishes. Let's call this "segment packaging latency".
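To put a rough number on segment packaging latency, here is a minimal sketch. It assumes the segment can only be published once its last frame has been encoded, and that encoding keeps pace with real time; the function name and parameters are made up for illustration.

```python
def segment_packaging_latency_ms(segment_seconds: float, position_seconds: float) -> float:
    """Time a frame waits for the rest of its segment before it can be published.

    position_seconds is the frame's offset from the start of its segment.
    Assumption: the segment is only packaged after its final frame is encoded,
    and the encoder runs in real time.
    """
    if not 0 <= position_seconds <= segment_seconds:
        raise ValueError("position_seconds must lie inside the segment")
    return (segment_seconds - position_seconds) * 1000.0


# The first frame of a 2s segment waits the full segment duration.
print(segment_packaging_latency_ms(2.0, 0.0))  # 2000.0
# A frame 1.5s into the segment waits only for the remaining 0.5s.
print(segment_packaging_latency_ms(2.0, 1.5))  # 500.0
```

Under those assumptions the worst case equals the segment duration, which is why shorter segments help here.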

To avoid this, it's necessary to use (typically invent) a new protocol other than DASH/HLS + mp4. You also need cache logic on the CDN to handle this new format.

For smooth playback without interruptions, devices want to buffer as much as possible, especially for unreliable connections. Let's call this "playback buffer latency".

Playback buffer latency can be minimized by writing a custom playback client; it's just a lot of work.

Then there is the ABR (adaptive bitrate) part, where a manifest is fetched that lists all available bitrates. The manifest needs to be updated, and devices need to fetch it before they can fetch the next content. Let's call this "manifest rtt latency".
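As a rough illustration of what the client does with the manifest, here is a sketch of one common ABR heuristic: pick the highest bitrate that fits under the measured throughput with some safety margin. This is not a real HLS/DASH parser; the `Variant` type, the URLs, and the 0.8 safety factor are all assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One entry of a (hypothetical, already parsed) manifest."""
    bitrate_kbps: int
    url: str


def pick_variant(variants: list[Variant], throughput_kbps: float, safety: float = 0.8) -> Variant:
    """Return the highest-bitrate variant that fits within the throughput budget.

    Falls back to the lowest bitrate when nothing fits.
    """
    budget = throughput_kbps * safety
    affordable = [v for v in variants if v.bitrate_kbps <= budget]
    if affordable:
        return max(affordable, key=lambda v: v.bitrate_kbps)
    return min(variants, key=lambda v: v.bitrate_kbps)


# Hypothetical manifest contents.
manifest = [
    Variant(800, "low/segment.mp4"),
    Variant(2500, "mid/segment.mp4"),
    Variant(6000, "high/segment.mp4"),
]

print(pick_variant(manifest, throughput_kbps=4000).bitrate_kbps)  # 2500
print(pick_variant(manifest, throughput_kbps=500).bitrate_kbps)   # 800
```

The latency cost described above comes from the fetch ordering, not from this selection step: the client needs a fresh manifest before it can request the next segment.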

Lastly (?) there is the latency from video encoding itself. For the most efficient encoding / highest quality, B-frames should be used. But those are "lookahead" frames, and a typical 3 frame lookahead already adds ~50 ms at 60fps. Not to mention the milliseconds spent doing the encoding calculations themselves.
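The ~50 ms figure is just frame count divided by frame rate. A quick sketch of the arithmetic (the function name is made up, and it ignores the encoding compute time mentioned above):

```python
def lookahead_latency_ms(lookahead_frames: int, fps: float) -> float:
    """Delay added by waiting for future frames before one can be emitted."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return lookahead_frames * 1000.0 / fps


print(lookahead_latency_ms(3, 60))  # 50.0
print(lookahead_latency_ms(3, 30))  # 100.0
```

The same lookahead costs twice as much wall-clock time at 30fps as at 60fps, since each frame the encoder waits for takes longer to arrive.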

Big players are rewriting large parts of the stack to have lower latency, including inventing new protocols other than DASH/HLS for streaming, to avoid the manifest RTT latency hit.

1 comments

For HLS you can use MPEG-TS, but mp4 is also an option (with the problem you talk about).

IMO one of the issues is that transcoding to lower resolutions usually happens on the server side, and that takes time. If the client transcoded, that latency would (mostly) go away.