

by btown 2928 days ago

Software for scaling up live-streaming CDN points of presence (POPs) is a pretty crazy domain. For on-demand video, you can think of a CDN as a cache that serves chunks known ahead of time. But what about live streaming? It's not feasible to stream frame-by-frame directly from your encoding backend to every viewer of the World Cup over something like RTMP, so you'd want to use a CDN. Typically, you push meaty (multi-second) HLS segments to your CDN as individual video files, or collections of files; once a segment is available, browsers/mobile clients request it as a whole over HTTP(S). This works well with existing CDN infrastructure (provided it can handle the write volume and has big enough inbound pipes)... but the huge issue is that the segment length plus round-trips is a lower bound on effective latency. And when interactivity is required, multi-second delays can be horrible.
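
To put rough numbers on that lower bound, here's a back-of-the-envelope sketch in Python. The function name, the 100 ms RTT, and the choice of two round-trips (one for the playlist, one for the segment) are my own illustrative assumptions, not measurements; real players usually buffer more than one segment, so actual delay sits well above this floor.

```python
def latency_floor(segment_seconds, rtt_seconds, round_trips=2):
    """Lower bound on delay behind live for whole-segment delivery.

    A segment can't be published until its last frame exists, so its first
    frame is already `segment_seconds` old by then. Each HTTP round-trip
    (assumed: one for the playlist, one for the segment) adds `rtt_seconds`.
    """
    return segment_seconds + round_trips * rtt_seconds


# Hypothetical segment lengths with an assumed 100 ms round-trip time.
for seg in (2.0, 6.0, 10.0):
    floor = latency_floor(seg, 0.1)
    print(f"{seg:>4.1f}s segments -> at least {floor:.1f}s behind live")
```

Shrinking the RTT barely moves the result; the segment duration dominates, which is why the fixes people propose go after segment length or delivery granularity.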

https://www.wowza.com/blog/hls-latency-sucks-but-heres-how-t... is a great writeup. Another overview of the problem, along with a proposed solution, is this excellent article from Twitter's Periscope team:

https://medium.com/@periscopecode/introducing-lhls-media-str...

> In HLS live streaming, for instance, the succession of media frames arriving from the broadcaster is normally aggregated into TS segments that are each a few seconds long. Only when a segment is complete can a URL for the segment be added to a live media playlist. The latency issue is that by the time a segment is completed, the first frame in the segment is as old as the segment duration... By using chunked transfer coding, on the other hand, the client can request the yet-to-be completed segment and begin receiving the segment’s frames as soon as the server receives them from the broadcaster.
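
For anyone who hasn't looked at chunked transfer coding on the wire, here's a minimal sketch of the HTTP/1.1 framing in Python: each chunk is a hex length, CRLF, the payload, CRLF, and a zero-length chunk ends the body. This is just the generic framing, not Periscope's implementation; `encode_chunk`, `iter_chunks`, and the fake "frame" payloads are names I made up, and the decoder assumes no trailer fields.

```python
import io


def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n" % len(payload) + payload + b"\r\n"


# A zero-length chunk terminates the body (assuming no trailer fields).
LAST_CHUNK = b"0\r\n\r\n"


def iter_chunks(stream):
    """Yield each chunk's payload as soon as it has been read from `stream`.

    `stream` is any binary file-like object, e.g. a socket wrapped with
    makefile("rb"). The caller sees data per chunk instead of waiting for
    the whole segment to finish.
    """
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("stream ended before the terminating chunk")
        # Ignore optional chunk extensions after ';'.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            stream.readline()  # consume the final CRLF
            return
        payload = stream.read(size)
        stream.read(2)  # CRLF that follows the chunk data
        yield payload


# Hypothetical stand-ins for media frames arriving from the broadcaster.
frames = [b"frame-1", b"frame-2", b"frame-3"]
wire = b"".join(encode_chunk(f) for f in frames) + LAST_CHUNK

for payload in iter_chunks(io.BytesIO(wire)):
    print(payload)
```

The point is that the server never has to know the final segment length up front, so it can forward frames as they arrive and the client can start decoding before the segment is complete.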

And Twitch's followup challenge:

https://blog.twitch.tv/twitch-invites-you-to-take-on-the-icm...

> This Grand Challenge is to call for signal-processing/machine-learning algorithms that can effectively estimate download bandwidth based on the noisy samples of chunked-based download throughput.
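
To make the problem concrete, here's about the simplest baseline I can think of, sketched in Python: a sliding-window harmonic mean over per-chunk throughput samples. This is not a solution to the challenge and not anything Twitch describes; the class name, window size, and sample numbers are all made up for illustration.

```python
from collections import deque


class HarmonicMeanEstimator:
    """Naive bandwidth estimate from per-chunk download samples."""

    def __init__(self, window=5):
        # Keep only the most recent `window` throughput samples (bits/s).
        self.samples = deque(maxlen=window)

    def add_chunk(self, num_bytes, seconds):
        """Record one chunk download; skip degenerate measurements."""
        if num_bytes <= 0 or seconds <= 0:
            return
        self.samples.append(num_bytes * 8 / seconds)

    def estimate(self):
        """Harmonic mean of the window, or None with no samples yet."""
        if not self.samples:
            return None
        return len(self.samples) / sum(1.0 / s for s in self.samples)


est = HarmonicMeanEstimator(window=3)
# Hypothetical (bytes, seconds) samples, including one suspiciously fast chunk.
for num_bytes, seconds in [(125_000, 1.0), (500_000, 1.0), (125_000, 0.01)]:
    est.add_chunk(num_bytes, seconds)
print(f"estimated bandwidth: {est.estimate() / 1e6:.2f} Mbit/s")
```

The harmonic mean is pulled toward the slow samples, so a single burst doesn't inflate the estimate much. It does nothing about samples that are slow only because the server was waiting on the encoder, which is presumably part of why this is posed as a research challenge.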

(IMO) If you're thinking that this is all rather silly, and that live video streaming shouldn't be done over HTTP in the first place... there are a lot of reasons why it ended up this way. CDN POPs are optimized for HTTP GET requests rather than stateful sessions, and Apple's smiting of Flash removed a lot of the incentive to innovate on RTMP servers. The ironic thing is that Internet connectivity is fast and reliable enough nowadays that RTMP might have been able to escape its association with "buffering" spinners and provide a much lower-latency experience. Hopefully there's better standardization in the future as live video becomes more mainstream.