Hacker News new | ask | show | jobs
by sandov 2244 days ago
Tangential (and noob) question: Could there be an encryption+compression scheme where:

1. Sender sends an encrypted stream at K bps.

2. Server takes the encrypted stream and compresses it, without decrypting anything. It then sends encrypted+compressed stream at J bps (J<K) to end recipient.

3. The end recipient decrypts the compressed stream using a key provided by the original sender, not the server.

This with reasonably secure encryption and reasonably size-efficient codecs, obviously. So the step 2 would add compression additional to the compression of the original codec.

Is this mathematically possible?

6 comments

If the data is encrypted well, it should be indistinguishable from random noise.

You can't compress random noise.

Ergo, you can't compress encrypted data. It's a nice idea though.

Generally it's not possible to compress well encrypted data with any meaningful scheme, but if your encryption is stream cipher based, you can subsample a stream of PCM audio on the server quite easily, by just removing, say every second byte (or every second pair of bytes). Then decrypting entities can insert dummy data at every second byte position, xor with the encryption stream and discard the data again, replacing it with interpolated data instead. Probably the same can be done for uncompressed image data, but any kind of serious compression wouldn't be possible and this simple compression can already be done on the clients without much overhead. So I guess it's mostly of theoretical use.

You would rather have each client send streams in multiple resolutions, say one in 1080p and one in 480p, and then make the central server decide which stream to send to which client. Taking one step further, the clients can be asked to adjust the quality of the stream they are sending depending on which streams they need. There are obvious latency concerns, but there are also bandwidth savings in having less data being uploaded by clients. Actually the bandwidth use is highest if the client uploads an uncompressed or barely compressed unencrypted stream for the server to compress. It's much better if the client's hardware did the compressing part.

Usually you compress first then encrypt. Unencrypted data has better compression ratio.

Also modern video data is already compressed where most frames are difference data.

Generally speaking, video compression algorithms are 'lossy' (cause a reduction in quality) and need to be able to 'see' the video in order to compress it. The compression typically removes details that are not perceivable by humans (this is also the case for image and audio compression), and by sending just the changes from frame to frame, rather than each entire frame. For both of these, the compression algorithm needs knowledge of the video stream, which would be impossible if it's compressed.

You could try to use a lossless compression algorithm (such as used by zip), but those are effectively useless on video in general, and even more so on encrypted data, which appears to be random.

Thanks for your answer.

> the compression algorithm needs knowledge of the video stream, which would be impossible if it's compressed.

That makes perfect sense and I guess my theory of secure encryption + post-re-compression fails because of this. But what if we didn't need perfectly secure encryption, but just per-block encryption. So the server knows that you've sent 60 frames, but doesn't know what is in those frames.

What if the codec was made in such a way that the server knew that certain blocks in the stream could be discarded to reduce size while the stream without those blocks still makes sense to the recipient.

For example: the stream is composed of 64 byte blocks, but the codec says that every 2 blocks there's a discardable block that adds image quality but is not essential. So, with this knowledge, the server discards every 2 blocks when sending that data to people with low bandwidth and sends the original stream with all its blocks to those with high bandwidth.

It's an extremely naive scheme, but maybe this principle could be applied to more complicated codecs, so the server only needs to know metadata about the stream (where each block is and whether it's essential), but not the content of the block itself (framebuffer and audio sample values).

I'm sorry if this idea is too dumb (and my English skills are not the best).

Interesting; some time ago the Ogg audio transport was working on bitrate peeling https://en.wikipedia.org/wiki/Bitrate_peeling with the basic idea that a stream can be encoded at one bitrate but can be served at that or any lower bitrate. A simpler example is FM stereo radio - the main frequency provides a mono audio signal, which works just fine, but if the receiver can also pick up the stereo sub-frequency (containing just the diff of left and right channels), it gets stereo.

Anyways, the wikipedia page linked above links to this about the same concept, for video: https://en.wikipedia.org/wiki/Scalable_Video_Coding so it might be feasible. Not sure how feasible/secure encrypting this would be.

Naively no, but I wonder if there is some variant of homomorphic encryption that does this? I would be surprised though if this was efficiently possible.
Yes, I think homomorphic encryption is what I was thinking about, although I didn't know its name until now.
You can't compress properly encrypted data.