Low-latency HLS creates partial segments by bucketing about 200ms of frames, instead of the 6s segments used in standard HLS. In WebRTC, by contrast, the endpoint sends each frame as soon as it is ready.
The apples-to-apples comparison here is 0ms (WebRTC, no send-side buffering) vs 200ms (low-latency HLS) or 6s (standard HLS). This is independent of the endpoint's latency to the CDN or source.
Another distinction is playback wait time, i.e., how quickly an endpoint can start rendering video after joining.
I’m assuming the full reference picture (typically an I-frame or a golden frame, depending on the codec) in low-latency HLS is only available at the start of each 6s segment and not in partial segments. So when joining a live stream, the receiving endpoint would have to wait up to 6s before rendering.
Similarly, in WebRTC it’s up to the system to generate a reference frame at regular intervals, as often as every second. Alternatively, it can be done reactively: a receiving endpoint can ask the sender for a new reference picture via a Full Intra Request (FIR). The wait time can then be as short as 1.5 times the round-trip time, as newer codecs can generate a new I-frame almost immediately upon receiving a request. There’s a slight CPU penalty for this, which means that a sender getting too many Full Intra Requests will typically throttle its responses to about one per second.
So the apples-to-apples comparison for wait time would be up to 1s for WebRTC vs up to 6s for HLS.
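To put rough numbers on the wait-time comparison, here is a minimal Python sketch. The function names and the simple model are my own assumptions for illustration: HLS is modeled as "wait for the next keyframe", and WebRTC as "about 1.5x RTT in the best case, bounded by the sender's throttle interval in the worst case". Real players and senders will behave differently in the details.

```python
def hls_worst_case_join_wait(keyframe_interval_s: float) -> float:
    """Worst case: the viewer joins just after a keyframe and must wait for the next one."""
    return keyframe_interval_s


def webrtc_join_wait(rtt_s: float, fir_throttle_s: float = 1.0) -> tuple[float, float]:
    """Return (best_case, worst_case) wait in seconds after a Full Intra Request.

    Assumption: best case is ~1.5x RTT; worst case is when the sender is
    throttling keyframe responses to one per fir_throttle_s.
    """
    best = 1.5 * rtt_s
    worst = max(best, fir_throttle_s)
    return best, worst


if __name__ == "__main__":
    print("HLS, keyframe only at 6s segment start:", hls_worst_case_join_wait(6.0), "s")
    best, worst = webrtc_join_wait(rtt_s=0.1)
    print(f"WebRTC at 100ms RTT: best {best:.2f}s, worst {worst:.2f}s")
```

With a 100ms RTT this gives a best case of roughly 0.15s and a throttled worst case of 1s for WebRTC, against 6s for the HLS assumption above.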
You don't necessarily need to wait for a reference picture to start playback. Modern encoders commonly support "intra-refresh", which spreads intra-coded regions across several consecutive frames so that a decoder can reconstruct a full picture without waiting for a keyframe. With that, you can set the periodic intra-refresh interval much lower than a 6s keyframe interval.
An HLS segment can carry any number of GOPs. A GOP length of FPS/2 or FPS/4 will get you an I-frame pretty quickly, allowing the GOP to be decoded. MPEG-DASH can do the same IIRC. So there doesn't need to be a segment-length delay in playback, and typically there isn't.
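As a quick sanity check on the GOP arithmetic, here is a small Python sketch. The helper names are hypothetical, and it assumes a constant frame rate and a fixed GOP length in frames.

```python
def keyframe_interval_s(fps: int, gop_frames: int) -> float:
    """Seconds between I-frames for a fixed GOP length at a constant frame rate."""
    return gop_frames / fps


def gops_per_segment(segment_s: float, fps: int, gop_frames: int) -> float:
    """How many GOPs (and therefore join points) fit in one segment."""
    return segment_s * fps / gop_frames


if __name__ == "__main__":
    fps = 30
    for label, gop in (("FPS/2", fps // 2), ("FPS/4", fps // 4)):
        print(label, keyframe_interval_s(fps, gop), "s between I-frames,",
              gops_per_segment(6.0, fps, gop), "GOPs per 6s segment")
```

At 30 fps, a GOP of FPS/2 (15 frames) puts an I-frame every 0.5s, so a 6s segment holds 12 GOPs and the worst-case wait for a decodable frame is about half a second rather than a full segment.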
In addition to what vr000m said above, I'll just add that when you make HLS chunks smaller, you're reducing the leverage you get from HLS's core design decisions. I tried to cover some of this in the post.
One way to think about this intuitively is that HLS and WebRTC are at opposite ends of one important trade-off axis.
HLS is about delivering media streams in a way that scales as cost-effectively as possible.
WebRTC is about delivering media frames at the lowest possible latency.
These are very different goals, and given current infrastructure and standards it's not possible to have your cake and eat it too here. That may change in the future as low-latency video becomes more and more important. QUIC, for example, is a new approach to building out a full stack that works around some of the fundamental trade-offs that exist today.
The result is that pushing HLS segments down to 200ms is not at all a clear win. We'll see what happens as HLS implementations improve. And I should say that my brain has been warped by working on UDP/RTP stuff for a long time. But my bet is that using 200ms HLS segments is, for most real-world users, going to make HLS worse in every way than WebRTC would be, for the same use cases. (That's definitely true today with the early implementations of LLHLS.)
I appreciated your post, so thank you. I would be more interested in understanding why you don't see <1s HLS chunk sizes as working in most cases. I feel like p99 real-world latency numbers would show some natural buffer sizes.
The smaller the chunk (file) sizes, the less benefit you're getting from pushing the chunks through a CDN. More requests will come back to your origin server. And there's a lot of complexity in the pipeline from encoder -> origin server -> CDN that's mostly hidden (which is good) until you hit big performance cliffs (which is bad).
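A back-of-the-envelope Python sketch of why chunk size matters for origin load. This is a deliberately simplified model of my own, not a description of any particular CDN: it assumes every edge cache fetches each chunk from the origin exactly once, and it ignores playlist requests, tiered caching, and request coalescing.

```python
def client_requests_per_second(viewers: int, chunk_s: float) -> float:
    """Each viewer requests one chunk per chunk duration."""
    return viewers / chunk_s


def origin_fetches_per_second(edge_caches: int, chunk_s: float) -> float:
    """Assumption: each edge cache pulls every chunk from the origin once."""
    return edge_caches / chunk_s


if __name__ == "__main__":
    viewers, edge_caches = 100_000, 100  # hypothetical numbers
    for chunk_s in (6.0, 1.0, 0.2):
        print(f"chunk={chunk_s}s: "
              f"{client_requests_per_second(viewers, chunk_s):,.0f} client req/s, "
              f"{origin_fetches_per_second(edge_caches, chunk_s):,.0f} origin fetches/s")
```

Under these assumptions, going from 6s to 200ms chunks multiplies both request rates by 30, while each request carries 30x less media, so the fixed per-request cost starts to dominate.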
This is something that I think most people trying to implement low latency HLS and DASH have struggled with. It's not only the connection to the client that can stall. Your CDN can stall internally, too, waiting on chunks. And, in fact, if your CDN never ever has any internal performance issues, that's probably an indication that it's configured in such a way as to make the costs of delivering video through the CDN pretty much the same as the costs of delivering that same data through cascading WebRTC media servers!
Also, the CDN -> client link is TCP, so you're giving up the ability to just drop packets. TCP is going to do its lossless/ordered thing for you, which again is great for most of what we do on the Internet but starts to actively work against you when you're trying to get down to very low latencies.
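Here is a toy Python simulation of that effect. It is only a sketch: the function name, the fixed retransmit delay, and the packet timings are assumptions for illustration, and real TCP behavior (congestion control, fast retransmit, timers) is far more involved.

```python
from typing import Optional


def delivery_times(arrivals: list[float], lost: set[int],
                   retransmit_delay_s: float, ordered: bool) -> list[Optional[float]]:
    """Time at which each packet is handed to the application.

    ordered=True models lossless in-order delivery (TCP-like): a lost packet is
    retransmitted and everything behind it waits.
    ordered=False models drop-and-move-on delivery (UDP/RTP-like): a lost packet
    is never delivered (None) and later packets are not held back.
    """
    out: list[Optional[float]] = []
    ready = 0.0  # earliest time the next in-order packet can be delivered
    for i, t in enumerate(arrivals):
        if i in lost:
            if not ordered:
                out.append(None)
                continue
            t += retransmit_delay_s  # packet only arrives after retransmission
        if ordered:
            ready = max(ready, t)
            out.append(ready)
        else:
            out.append(t)
    return out


if __name__ == "__main__":
    arrivals = [0.00, 0.02, 0.04, 0.06]  # one packet every 20ms
    print("ordered:  ", delivery_times(arrivals, {1}, 0.1, ordered=True))
    print("unordered:", delivery_times(arrivals, {1}, 0.1, ordered=False))
```

With packet 1 lost and a 100ms retransmit delay, the ordered case holds packets 2 and 3 until about 0.12s, while the unordered case delivers them on time at 0.04s and 0.06s and simply skips the lost one.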
Does that make sense? I tried to cover some of this in the footnotes. Apologies for not doing a better job.
this will be the winner purely because Apple devices are the lowest common denominator. That is, one must support iOS and if iOS only supports LL-HLS but other devices optionally support LL-HLS then LL-HLS is the winner.