by PavlovsCat 2545 days ago
> all the needed work to make the file "streamable", e.g. segmenting and encoding file parts

Pardon the probably stupid question, but is that (splitting into several files) really necessary? Isn't supporting HTTP range requests enough?

Or is it because clients tend to download the whole file, even when the listener is still at the very beginning and might skip, meaning wasted bandwidth?

2 comments

Not a stupid question at all! I actually looked into byte-range requests as an alternative to downloading individual segments, and Amazon S3 does support them. I would think it's similar to implement and wouldn't have as many moving parts.

When I started the project, I looked at how other streaming sites fetched their audio and saw a pattern of individual GET requests for segments. I thought, "Hey, they're doing this for a reason, and even though I don't fully understand this reason, maybe I should do it too." At the time, I think I justified it by reasoning that someone who wants to rip the stream would have to piece the track together rather than just request the full track.

In the end, it really doesn't matter: anyone determined enough can get whatever is being sent to their client. It's one of those moments where I didn't bother to really think it through and factor in the practicality aspect. I'd like to go back and revisit byte-range requests, though, just to see how it would work differently.

MPEG-DASH and HLS both use segments. I think one reason is to allow for dynamic quality switching: if the bandwidth drops, the player can simply start pulling segments encoded at a lower bitrate. This is harder to do with byte-range, since bytes don't map to time cleanly.
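
A minimal sketch of that switching logic, assuming the player measures throughput itself. The bitrate ladder, the 0.8 safety factor and the URL layout are all hypothetical, not what DASH or HLS mandate.

```python
def pick_bitrate(available_kbps: list[int], measured_kbps: float,
                 safety: float = 0.8) -> int:
    """Pick the highest encoded bitrate that fits the measured bandwidth.

    `safety` leaves some headroom so small dips don't stall playback.
    Falls back to the lowest rendition when nothing fits.
    """
    budget = measured_kbps * safety
    fitting = [b for b in available_kbps if b <= budget]
    return max(fitting) if fitting else min(available_kbps)

def segment_url(base: str, bitrate_kbps: int, index: int) -> str:
    """Hypothetical URL layout: one directory per bitrate, numbered segments."""
    return f"{base}/{bitrate_kbps}k/segment_{index:05d}.m4s"
```

Because every rendition is cut at the same time boundaries, segment 7 at 128 kbps covers the same stretch of audio as segment 7 at 256 kbps, so the player can change bitrate between any two segments without doing byte-offset math.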

As someone else who has built streaming things: HTTP range requests aren't always enough.

Quite a number of older Android clients (TVs, cheap phones, etc.) don't have an up-to-date browser on them. They will make the first range request correctly (with a byte range showing that the browser knows the full Content-Length, etc.)...

But you'll have to build your own scheduling system to pull in further requests, which is painful, clumsy and doesn't always work.