by ComputerGuru 2049 days ago
This sounds like an MTU issue. TCP takes care of mere (e.g. probabilistic) packet loss ok. MTU issues have actually crept back up because TLS exacerbates any underlying MTU problems. IPv6 doubly so (when any hops - especially yours - don’t follow path MTU discovery requirements).
5 comments

TCP doesn't take care of packet loss. What TCP does is make sure your packets are not lost, even if you have 99% packet loss. On the flip-side, that means that if TCP can't deliver a single packet (say out of a billion), the whole stream stops at this one packet...
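
To make the stall concrete, here is a minimal sketch of in-order delivery, assuming a toy receiver that only hands data to the application in sequence order. The function name and the arrival list are made up for illustration; real TCP stacks are far more involved.

```python
def tcp_like_delivery(arrivals):
    """Release segments to the application strictly in sequence order.

    `arrivals` is the order in which sequence numbers reach the receiver.
    Returns, for each arrival, the list of segments handed to the app.
    """
    buffered = set()
    next_seq = 0
    released_per_arrival = []
    for seq in arrivals:
        buffered.add(seq)
        released = []
        # Hand over everything that is now contiguous from next_seq.
        while next_seq in buffered:
            buffered.remove(next_seq)
            released.append(next_seq)
            next_seq += 1
        released_per_arrival.append(released)
    return released_per_arrival


# Segment 1 is lost and only shows up after a retransmission.
timeline = tcp_like_delivery([0, 2, 3, 4, 1])
print(timeline)  # [[0], [], [], [], [1, 2, 3, 4]]
```

Segments 2, 3 and 4 arrived fine, but the application sees nothing until segment 1 is retransmitted.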

Which is why TCP is a horrible choice for any streaming service and a horrible choice for lossy connections, and I would be quite surprised if Netflix relied on it. UDP is the perfect choice for streaming, since video decoders can handle packet loss pretty well. The rest you can achieve with good tradeoff between Reed-Solomon codes and key framing.

I can't find any solid source for it, but I think most web video streams are TCP:

https://news.ycombinator.com/item?id=8638946

Even the live ones like Twitch.

Because they all want to run through HTML5 web browsers, re-use the same TLS as everyone else, and not write a ton of new code.

When QUIC gets big, they'll probably switch to UDP, not because it's better on every connection, but because it will be popular and it will be better on lossy connections. But for now TCP does work fine.

That's why youtube-dl can rip video without implementing tons of weird proprietary protocols - It's just HTTPS. Otherwise these video sites wouldn't run at all in Firefox.

I'm not sure this statement is generally true for Netflix's use case.

UDP provides no out-of-order packet handling, which _needs_ to be handled for video streaming. UDP is by default unbuffered throughout transport and tends to cause greater stress to client systems since they need to respond per packet rather than per traffic stream (IP+port combo). As a client developer, you end up reimplementing 90-95% of what TCP gives you out of the box at great development and QA cost. You also drain battery on mobile devices with all the interrupts you're causing doing UDP. The upside with a UDP-based implementation is the latency from server to client display is usually much less (tens of milliseconds vs hundreds to thousands), but the trade-offs involved are almost never worth it for a static media streaming site like Netflix.
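
As a rough illustration of the reordering work a UDP client has to take on itself, here is a sketch of a receive-side reorder buffer. The class name, the `max_held` knob and the "skip the gap once too many packets are waiting" policy are assumptions made for this example, not how any particular player does it.

```python
import heapq


class ReorderBuffer:
    """Hypothetical receiver-side buffer for sequence-numbered datagrams."""

    def __init__(self, max_held=3):
        self.max_held = max_held  # packets to hold before giving up on a gap
        self.next_seq = 0         # next sequence number the decoder expects
        self.heap = []            # min-heap of (seq, payload)

    def push(self, seq, payload):
        """Accept one datagram; return payloads that are now ready, in order."""
        if seq < self.next_seq:
            return []  # too late (or a duplicate): drop it
        heapq.heappush(self.heap, (seq, payload))
        ready = []
        while self.heap:
            head_seq, head_payload = self.heap[0]
            if head_seq < self.next_seq:
                heapq.heappop(self.heap)  # duplicate of something already released
                continue
            if head_seq != self.next_seq and len(self.heap) <= self.max_held:
                break  # there is a gap, but we are still willing to wait
            # Either in order, or we gave up on the gap and skip past it.
            heapq.heappop(self.heap)
            ready.append(head_payload)
            self.next_seq = head_seq + 1
        return ready


buf = ReorderBuffer(max_held=2)
out = [buf.push(seq, p) for seq, p in
       [(0, "a"), (2, "c"), (1, "b"), (4, "e"), (5, "f"), (6, "g"), (3, "d")]]
print(out)  # [['a'], [], ['b', 'c'], [], [], ['e', 'f', 'g'], []]
```

Packet 1 arriving late is reordered, packet 3 is treated as lost once three later packets are waiting, and its eventual arrival is discarded. Retransmission, congestion control and pacing would all still be missing.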

Even dynamic media streaming sites like Twitch rarely dip into UDP server-client implementations unless there are some unusual requirements.

You'd only allow packet loss without re-transmission (e.g. pure UDP) if you really need low latency, like for a video call.

Netflix is pure TCP I'm sure - look up HLS and DASH.

Aren’t MTU issues typically only up to a router? As in, even if the parent had a different MTU than Netflix uses, it wouldn’t matter since their router or the ISP’s router will transform packets between their appropriate MTUs?

And if this is true, then how could it be that Amazon works without problem and Netflix doesn’t?

"how could it be that Amazon works without problem and Netflix doesn’t"

Supporting Path MTU discovery (PMTUD), or perhaps just capping their outbound packets to 1450 or similar. Cloudflare found and fixed a problem in this space: https://blog.cloudflare.com/path-mtu-discovery-in-practice/
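
For the "cap outbound packets" option, the arithmetic is simple enough to sketch. This assumes IPv4 and TCP headers without options (20 bytes each) and the fixed 40-byte IPv6 header; the hop list is a made-up path with one PPPoE-style 1492-byte link.

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, no TCP options


def tcp_mss(mtu, ipv6=False):
    """Largest TCP payload that fits in one packet of size `mtu`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


def survives_path(packet_size, hop_mtus):
    """True if a don't-fragment packet fits through every hop."""
    return packet_size <= min(hop_mtus)


path = [1500, 1492, 1500]  # hypothetical path, middle hop is the bottleneck

print(tcp_mss(1500))              # 1460
print(tcp_mss(1450))              # 1410
print(tcp_mss(1450, ipv6=True))   # 1390
print(survives_path(1500, path))  # False
print(survives_path(1450, path))  # True
```

A sender that caps packets at 1450 bytes gets through this path even if the ICMP feedback that PMTUD relies on never arrives; a sender using full 1500-byte packets with "don't fragment" set does not.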

Oh wow, TIL about the “don’t fragment” bit and all the stuff that comes with it.

Thanks for sharing, I learned a lot from that blog post.

It's not unusual for a server to also be a router in a layer 3 link aggregation setup. It's extremely common for IPs to be load-shared amongst servers using ECMP. If each server is connected to 2 Top-of-rack (TOR) switches and advertises the route to the shared IP through both TORs, you can very easily have ICMP probes used for PMTU take the wrong route and be dropped. The result is a TCP session with a default MTU that may not work along all traversed paths and will suffer from fragmentation.
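
A toy model of that failure mode, assuming the ECMP next hop is chosen by hashing the packet's addresses, protocol and ports. The hash function, server names and addresses (all from documentation ranges) are invented for illustration; real switches hash differently.

```python
import hashlib

SERVERS = ["server-a", "server-b"]  # hypothetical backends sharing one IP


def ecmp_pick(src, dst, proto, sport=0, dport=0):
    """Pick a next hop from a hash of the packet's header fields."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


VIP = "192.0.2.10"
CLIENT = "198.51.100.7"

# Every segment of the client's TCP flow carries the same fields,
# so the whole flow sticks to one server.
flow_owner = ecmp_pick(CLIENT, VIP, "tcp", 51000, 443)

# An ICMP "fragmentation needed" error comes from some router on the
# path, not from the client, and has no ports, so it hashes on its own.
routers = [f"203.0.113.{i}" for i in range(1, 65)]
icmp_targets = {r: ecmp_pick(r, VIP, "icmp") for r in routers}
misdelivered = [r for r, s in icmp_targets.items() if s != flow_owner]

print(f"flow handled by {flow_owner}")
print(f"{len(misdelivered)} of {len(routers)} ICMP errors reach the other server")
```

Any ICMP error that lands on the other server is useless to the connection it was meant for, so the server owning the flow never learns to lower its packet size.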

> TCP takes care of mere (e.g. probabilistic) packet loss ok.

I'd imagine this is largely due to MSS clamping rather than actual MTU caused packet loss.

Isn’t streaming done usually via UDP?
It’s typically all HTTP requests; nowadays with HTTP/3 we are back to using UDP, but apart from real-time video conferencing etc. I don’t believe many streaming services use anything other than HTTP.
HTTP over TCP to cache nodes.

Fire up the developer tools / network view and go watch a Netflix video; try pausing, etc. It is incredibly straightforward.

... no it doesn't. Like not even close.