| HN Mirror


by tzs 2271 days ago

Note: what follows is probably not how anyone actually does it. It is just an illustration that adaptive video is not incompatible with E2E encryption.

Suppose you have a block of 4 pixels, represented by 4 24-bit values. Instead of sending the 4 pixel values, send one 24-bit value that is the average of all 4 pixel values, and then the actual 24-bit values for 3 of the 4 pixels. The receiver can figure out the 4th pixel from those 3 and the average: per color channel, it is 4 times the average minus the sum of the other 3.
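A minimal sketch of the 2x2 idea in Python. The function names (`encode_block`, `decode_full`, `decode_low`) are hypothetical. It also makes one assumption beyond the comment above: the base value carries the per-channel sum rather than a rounded 8-bit average, because rounding the average discards the two low bits needed to recover the 4th pixel exactly.

```python
def encode_block(pixels):
    """Split a 2x2 block (four (r, g, b) tuples) into a base part and a detail part."""
    # Per-channel sums; the block average is sum // 4. Sending the sum
    # (an assumption of this sketch) keeps the reconstruction exact.
    base = tuple(sum(p[c] for p in pixels) for c in range(3))
    detail = [tuple(p) for p in pixels[:3]]  # three of the four pixels, verbatim
    return base, detail


def decode_full(base, detail):
    """Recover all four pixels from the base sums and the three detail pixels."""
    fourth = tuple(base[c] - sum(p[c] for p in detail) for c in range(3))
    return detail + [fourth]


def decode_low(base):
    """Without the detail part, paint all four pixels with the block average."""
    avg = tuple(s // 4 for s in base)
    return [avg] * 4


block = [(10, 20, 30), (40, 50, 60), (70, 80, 90), (100, 110, 123)]
base, detail = encode_block(block)
```

Here `decode_full(base, detail)` gives back the original four pixels, while `decode_low(base)` gives four copies of the block average, which is the half-resolution fallback.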

Send the average values and the groups of 3 discrete pixel values in logically separate streams, separately encrypted.

If something transporting this needs to lower the bandwidth, it can just drop the E2E discrete pixel stream, leaving just the E2E average stream. The receiver can then use that average value for all 4 pixels, in effect getting a video that is 1/2 the resolution both horizontally and vertically.
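A rough sketch of that flow, to show that the relay can drop a stream without ever holding a key. Everything here is hypothetical: `toy_cipher` is a toy XOR keystream built from SHA-256 and stands in for a real authenticated cipher (it is NOT secure), and the stream names `"base"` and `"detail"` are made up for the illustration.

```python
import hashlib


def toy_cipher(key: bytes, stream_id: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher, NOT secure. Applying it twice with the
    same key and stream_id returns the original data.
    """
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + stream_id + (i // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)


def send(key: bytes, base: bytes, detail: bytes) -> dict:
    """Sender: encrypt the two streams separately."""
    return {
        "base": toy_cipher(key, b"base", base),
        "detail": toy_cipher(key, b"detail", detail),
    }


def relay(packets: dict, low_bandwidth: bool) -> dict:
    """Relay: has no key; it only decides which ciphertexts to forward."""
    if low_bandwidth:
        return {"base": packets["base"]}
    return dict(packets)


def receive(key: bytes, packets: dict) -> dict:
    """Receiver: decrypt whatever streams arrived."""
    return {name: toy_cipher(key, name.encode(), ct) for name, ct in packets.items()}


key = b"known to participants only"   # hypothetical shared key
base = bytes([55, 65, 75])            # averages for one block
detail = bytes(range(9))              # three pixels for the same block
sent = send(key, base, detail)
```

With `low_bandwidth=True` the receiver ends up with only the decrypted base stream and falls back to the averaged picture; otherwise it gets both streams and can rebuild full resolution.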

This scheme only gives you two rates: full resolution and 1/2 x 1/2. No doubt you could do systems based on block sizes other than 2x2, and with multiple levels of averaging, that would give a wider range of fallbacks.

Actual state-of-the-art video encoding is, I believe, based on things like the discrete cosine transform, which represents an image as a sum of cosines of various frequencies.

In this kind of representation the higher frequencies correspond to higher resolution detail in the image. I'd expect that you could do an E2E transmission scheme where you have different encrypted streams for different frequency ranges. Like with my far less sophisticated or clever 2x2 averaging scheme above, you could simply drop the streams for higher frequencies and the receiver would be able to reconstruct a lower resolution image, but unlike my 2x2 averaging scheme this would have much finer drops in resolution.
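To make the frequency-band idea concrete, here is a small pure-Python sketch on a single row of 8 samples. It is only an illustration under simplifying assumptions: real codecs work on 2D blocks and add quantization and prediction, and the helper names (`split_bands`, `reconstruct`) and the band edges are made up. Encryption of each band is left out; each band would be its own encrypted stream.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)))
    return out


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
            * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]


def split_bands(coeffs, edges):
    """Cut the coefficient list into bands, lowest frequencies first."""
    bounds = [0] + list(edges) + [len(coeffs)]
    return [coeffs[a:b] for a, b in zip(bounds, bounds[1:])]


def reconstruct(bands, n):
    """Rebuild n samples from the leading bands that arrived; missing ones count as zero."""
    coeffs = [c for band in bands for c in band]
    coeffs += [0.0] * (n - len(coeffs))
    return idct(coeffs)


row = [52, 55, 61, 66, 70, 61, 64, 73]
bands = split_bands(dct(row), edges=[1, 4])   # three bands: DC, low, high
full = reconstruct(bands, len(row))           # all bands arrived
medium = reconstruct(bands[:2], len(row))     # highest band dropped
coarse = reconstruct(bands[:1], len(row))     # only the DC band arrived
```

With all three bands the row comes back unchanged (up to floating-point error). With only the first band every sample becomes the row average, and the middle case sits in between, which is the finer-grained degradation described above.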

1 comment

ric2b 2270 days ago

> Suppose you have a block of 4 pixels, represented by 4 24-bit values. Instead of sending the 4 pixel values, send one 24-bit value that is the average of all 4 pixel values, and then the actual 24-bit values for 3 of the 4 pixels.

So you still send 4*24 bits? What's the point?

> If something transporting this needs to lower the bandwidth, it can just drop the E2E discrete pixel stream, leaving just the E2E average stream. The receiver can then use that average value for all 4 pixels, in effect getting a video that is 1/2 the resolution both horizontally and vertically.

But you need knowledge of this protocol, so the sender is the only one able to do this. In that case just encode the downsampled resolution and send that, no tricks needed.

tzs 2270 days ago

The way these video meeting services work is the participants all connect to the service's servers. Each participant sends their video feed to the server, which sends it on to the other participants in the meeting.

It's that server that wants to be able to dynamically downgrade outgoing feeds based on the bandwidth between it and the meeting participants, which can vary from participant to participant.

Alice, for example, might be on a symmetric gig fiber connection with consistent and low latency. Her client can send a high resolution feed to the server. Bob might have no trouble with receiving that, but Carol might be on a slower, less stable connection, and need a lower resolution version.

If you aren't trying to do E2E encryption, you can handle this by having the server deal with taking the high resolution feed from Alice and generating a low resolution feed and then sending the other participants whichever is the best version they can handle. That works because without E2E encryption the server actually has access to the video, so it can do things like resample and re-encode.

If you are using E2E, though, then the only parties that should have access to the video itself are the meeting participants. The server should not have access to the video except in encrypted form.

The problem then is how to encode and encrypt a video stream so that a server copying that stream between a sender and one or more recipients can reduce the resolution of a copy even though it does not have access to the unencrypted video.
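One way to picture the server's side of this, as a hedged sketch rather than how any real service does it: each packet carries a layer number in the clear and an encrypted payload the server cannot read, and the server forwards to each recipient only as many layers as that recipient's bandwidth allows. The `Packet` type, the per-layer costs, and the bandwidth figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    layer: int         # 0 = base layer, higher = extra detail; visible to the server
    ciphertext: bytes  # encrypted payload; opaque to the server


def layers_for(bandwidth_kbps, layer_costs_kbps):
    """Count how many leading layers fit in the budget (the base layer is always kept)."""
    total, count = 0, 0
    for cost in layer_costs_kbps:
        if count > 0 and total + cost > bandwidth_kbps:
            break
        total += cost
        count += 1
    return count


def forward(packets, bandwidth_kbps, layer_costs_kbps):
    """Pick the packets to forward to one recipient, without decrypting anything."""
    keep = layers_for(bandwidth_kbps, layer_costs_kbps)
    return [p for p in packets if p.layer < keep]


costs = [300, 900, 2500]  # hypothetical kbps per layer
alice_feed = [Packet(0, b"\x01"), Packet(1, b"\x02"), Packet(2, b"\x03")]
to_bob = forward(alice_feed, 10_000, costs)   # fast link: every layer
to_carol = forward(alice_feed, 1_000, costs)  # slow link: base layer only
```

The server's decision depends only on the layer numbers and the recipient's bandwidth, so Bob and Carol can get different versions of Alice's feed while the payloads stay encrypted end to end.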