TCP deals with congestion and bandwidth allocation automatically. Decades of algorithms work in your favor if you use TCP.
A single missing packet will seriously mess up your video stream. Remember that long sequences of movie pictures are sent as a single whole image plus delta instructions to generate the rest of the images. A single bad packet can screw up a second of video.
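To make the keyframe-plus-deltas point concrete, here is a toy model rather than a real decoder. It assumes one packet per frame, a keyframe every 30 frames (roughly one per second at 30 fps) and no error concealment. The function name and parameters are made up for illustration.

```python
def decodable_frames(num_frames, keyframe_interval, lost):
    """Return the set of frame indices that decode cleanly.

    Frame i is a keyframe when i % keyframe_interval == 0; every other frame
    is a delta that depends on the frame before it.
    """
    ok = set()
    chain_intact = False
    for i in range(num_frames):
        is_keyframe = i % keyframe_interval == 0
        if i in lost:
            chain_intact = False   # this frame never arrived
        elif is_keyframe:
            chain_intact = True    # a keyframe decodes on its own
        # a delta that did arrive inherits whatever state the chain is in
        if chain_intact:
            ok.add(i)
    return ok


# Assumed numbers: 60 frames, keyframe every 30 frames, frame 5 lost.
good = decodable_frames(60, 30, lost={5})
broken = sorted(set(range(60)) - good)
print(len(broken), broken[0], broken[-1])  # 25 5 29
```

In this model, losing frame 5 spoils every frame up to the next keyframe, which is most of a second. Real codecs conceal errors better than this, so treat it as an upper bound on the damage.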
There's a good explanation in http://www.dtc.umn.edu/~odlyzko/doc/net.neutrality.delusions... (2008) section 3 which explains "why faster-than-real-time progressive downloads of music or video are far preferable to real-time streaming. But first let us consider the strange situation in which this issue is not discussed publicly, and the advocates of each of the two types of video transmission mostly seem unaware there is a real alternative to their preferred solution, and that there is a serious decision that has to be made for each video service."
You're right that a few missing packets may not matter: MPEG has a lot of error checking and error handling built into the transport stream layer.
But I think the bigger reason Netflix uses TCP is that TCP has a much easier time with NAT traversal and doesn't need port forwarding. Looking and acting like normal web traffic makes it easier to guarantee that things will just work without any special accommodation.
UDP does not give any significant benefits over TCP when latency is not as important (e.g. playing previously recorded movies as opposed to live streaming).
In fact, UDP makes things much harder. For example, unlike with TCP, with UDP you need to worry about the rate at which you're sending data: too slow and you're not utilizing the full bandwidth; too fast and packets get dropped because the client can't receive them fast enough.
If you don't have a really good reason for UDP, you should stick with TCP. Otherwise, be prepared to implement your own congestion/rate control, and unless you're familiar with the subject, that might be a challenge.
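As a rough illustration of what "worrying about the rate" means, here is a minimal token-bucket pacer you might put in front of a UDP sendto() call. It only enforces a fixed rate that you pick yourself; it does not react to loss the way TCP congestion control does. The class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow at most `rate` bytes per second, with bursts up to `burst` bytes."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock          # injectable so the pacer can be tested
        self.tokens = float(burst)  # start with a full bucket
        self.last = clock()

    def try_consume(self, nbytes):
        """Return True if nbytes may be sent now, False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A sender would call try_consume(len(datagram)) before each sendto() and sleep or drop when it returns False. Choosing the rate, and adapting it when the path changes, is the hard part that TCP already does for you.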
IMO a good example of using UDP unnecessarily is Etsy's statsd. Their argument is that it's OK if data is lost, so there's no need for TCP. Unfortunately, with their approach, when there's enough traffic on the LAN to cause a router to drop packets, there's not much bandwidth left for other data either. They could instead use TCP with an (optionally) small send/receive buffer and an O_NONBLOCK socket: when send() returns EAGAIN, simply ignore it. This approach still drops data when there's too much of it, but at the same time it minimizes the amount of data in the network and plays nicely with other services sharing the same routers.
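The non-blocking TCP idea above could look roughly like this. It is a sketch, not how statsd or any existing client works. The function names and the 4096-byte buffer are assumptions, and the kernel may round the buffer size up.

```python
import socket


def make_lossy_tcp_sender(host, port, sndbuf=4096):
    """Connect over TCP, then switch to non-blocking mode with a small send buffer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask for a small send buffer so a backlog turns into drops quickly.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    sock.connect((host, port))
    sock.setblocking(False)  # equivalent to setting O_NONBLOCK on the socket
    return sock


def send_or_drop(sock, payload):
    """Try to send one record; return False if it was dropped or cut short."""
    try:
        sent = sock.send(payload)
    except BlockingIOError:  # EAGAIN / EWOULDBLOCK: buffer is full, drop it
        return False
    # A short write leaves a truncated record in the stream; a real client
    # would need framing that lets the receiver resynchronize.
    return sent == len(payload)
```

The caveat is that TCP is a byte stream, so a partial send() can leave half a record on the wire. That is something UDP's datagram boundaries give you for free and this approach has to handle itself.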
TCP is more efficient than UDP if you don't need low-latency delivery, because you can compress more data at once and waiting for retransmitted packets isn't a problem. This also ensures you get a high-quality stream throughout the viewing experience unless there are major congestion problems.
A video chat will always prefer UDP transmission for the lowest latency possible, but quality may vary and bandwidth usage can technically be higher.
"TCP is more efficient than UDP if you don't need low-latency delivery, because you can compress more data at once and waiting for retransmitted packets isn't a problem."
Would you care to elaborate what you mean by any of those things? I happen to have some background in protocols but I honestly can't make sense of any part of that sentence.
If you know that the receiver is going to receive all of the data that you send (or at least a lot of it at once, for example if you have a 5-second buffer), then you can more efficiently compress all of the media being sent.
Typically, a UDP-based video stream will have less dependency among the packets, so that even if a packet is dropped, the next packet is still decodable. This means that there is more redundant information per-packet as a durability measure.
UDP-based streams also often have something called forward error correction (FEC). This is where you encode lower quality versions of media samples in subsequent packets. Again this is trading more bandwidth for realtime durability. If you miss a packet, then the next packet probably has the same media sample in a lower quality, and your 100ms jitter buffer gives you just enough time to make use of it. This is far more time-efficient than requiring the receiver to ask the sender to retransmit a missed packet.
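Here is a toy version of the redundancy scheme just described, where each packet also carries a low-quality copy of the previous sample. The packet layout and helper names are made up, and "low quality" is faked by truncating bytes rather than re-encoding at a lower bitrate.

```python
def low_quality(sample):
    # Stand-in for a lower-bitrate encoding: keep only the first half of the bytes.
    return sample[: len(sample) // 2]


def packetize(samples):
    """Packet n carries sample n plus a low-quality copy of sample n-1."""
    packets = []
    for n, sample in enumerate(samples):
        backup = low_quality(samples[n - 1]) if n > 0 else None
        packets.append({"seq": n, "primary": sample, "backup": backup})
    return packets


def receive(packets, count):
    """Rebuild `count` samples from whichever packets arrived."""
    by_seq = {p["seq"]: p for p in packets}
    out = []
    for n in range(count):
        if n in by_seq:
            out.append(("full", by_seq[n]["primary"]))
        elif n + 1 in by_seq:
            # Packet n was lost, but the next packet carries a degraded copy.
            out.append(("degraded", by_seq[n + 1]["backup"]))
        else:
            out.append(("lost", None))
    return out


samples = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
arrived = [p for p in packetize(samples) if p["seq"] != 1]  # drop packet 1
print(receive(arrived, len(samples))[1])  # ('degraded', b'bb')
```

The cost is visible in packetize(): every packet is about 50% larger here, which is the bandwidth-for-durability trade. Two consecutive losses still leave a gap, since only one packet of history is carried.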
To the point about retransmission: in UDP cases, it's often not worth bothering about. By the time you round-trip the request, you've probably already received more recent media.
I should say that this isn't about TCP vs UDP so much as it is about buffered vs realtime streaming.
"If you know that the receiver is going to receive all of the data that you send (or at least a lot of it at once, for example if you have a 5-second buffer), then you can more efficiently compress all of the media being sent."
That has nothing to do with the underlying transport.
"Typically, a UDP-based video stream will have...""UDP-based streams also often have something...""in UDP cases, it's often not worth bothering about... "
Here you're describing a bunch of possible properties a hypothetical protocol written on top of unreliable datagrams would/should have.
"I should say that this isn't about TCP vs UDP so much as..."
This part is correct. You should keep this part.
"..it is about buffered vs realtime streaming."
Right. Let's not conflate transport protocols with media streaming methodologies please.
Every site streaming HTML5 video, including YouTube, is using TCP. Traffic engineering appears to have reduced the argument for UDP for media in recent years, which is a good thing!
In addition to the fact that even a single missing packet is actually a "big deal" for compressed video streams, the assumption that the number of missing packets would be "only a few" is erroneous. I've seen situations where packet loss was as high as 60%, and you wouldn't want your video stream to turn to garbage, or have your service and app spend all their effort dealing with that rather than letting the equipment just do its job.