Hacker News
by kwantam 887 days ago
Yes, a correct encryption algorithm can encrypt (essentially) any bit string. But it's quite easy to turn a correct encryption algorithm into an incorrect one by bolting on something seemingly innocuous.

Here's a concrete example. Let's say you decide you want to make AES encryption more efficient by defining a new standard, "lzAES", that is just:

    Enc_lz(key, msg) := AES-Encrypt(key, lzw_compress(msg))
    Dec_lz(key, ctxt) := lzw_decompress(AES-Decrypt(key, ctxt))
This "works", in the sense that you can correctly decrypt ciphertexts, and it certainly seems innocuous. But it is now an insecure cipher!

Here's why: the definition of a secure cipher is that ciphertexts resulting from the encryptions of any two messages of equal length are indistinguishable. In other words, for any two messages you can choose, if I encrypt both messages under a randomly generated key, you can't tell which ciphertext corresponds to which message. In contrast, lzAES as defined above does not have this property: you could choose one message that's easily compressible (say, a string of zeros) and one that's not (say, a uniformly random string), and then you'd be able to tell which ciphertext corresponds to which plaintext just by looking at the ciphertext lengths.
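The length difference is easy to see in a minimal sketch. Two assumptions are baked in: `zlib` stands in for LZW (Python's standard library has no LZW), and `toy_encrypt` is a hypothetical length-preserving stand-in for AES in a stream-like mode, built from a SHAKE-256 keystream. It is not AES and is not secure (it even reuses the keystream across messages); it exists only so that ciphertext length equals compressed length.

```python
import hashlib
import os
import zlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Hypothetical length-preserving cipher stand-in (NOT AES, NOT secure)."""
    keystream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


def enc_lz(key: bytes, msg: bytes) -> bytes:
    """Compress-then-encrypt, mirroring Enc_lz above (zlib instead of LZW)."""
    return toy_encrypt(key, zlib.compress(msg))


key = os.urandom(32)
zeros = bytes(1024)        # highly compressible message
noise = os.urandom(1024)   # essentially incompressible message

ct_zeros = enc_lz(key, zeros)
ct_noise = enc_lz(key, noise)

# Both plaintexts are 1024 bytes long, yet the ciphertext lengths differ,
# so an observer can tell which ciphertext belongs to which message.
print(len(ct_zeros), len(ct_noise))
```

With a real block cipher mode, padding would round the lengths up to a block boundary, but a gap this large would still show.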

And this is not just a definitional issue! If you use lzAES to encrypt something, an attacker can guess your message and test whether compressing the guess gives the same length as your ciphertext. Guess-and-check isn't possible with a secure cipher, but it is with lzAES. In other words, lzAES gives away information about your plaintext!
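Guess-and-check can be sketched the same way. This assumes the attacker sees a ciphertext whose length equals the compressed length (a length-preserving cipher; block padding would make the check coarser), and again uses `zlib` in place of LZW. The messages and the helper names `observed_length` and `consistent_guesses` are made up for illustration.

```python
import zlib


def observed_length(msg: bytes) -> int:
    """Length an eavesdropper would see, assuming a length-preserving cipher
    is applied after compression."""
    return len(zlib.compress(msg))


def consistent_guesses(ciphertext_len: int, guesses: list[bytes]) -> list[bytes]:
    """Keep only the guesses whose compressed length matches the observation."""
    return [g for g in guesses if observed_length(g) == ciphertext_len]


secret = b"attack at dawn, attack at dawn, attack at dawn"
seen = observed_length(secret)  # all the attacker learns from the wire

# Every candidate has the same plaintext length (46 bytes) as the secret.
candidates = [
    b"retreat at dusk, hold position, await orders!!",
    secret,
    b"x9$Lq0!zPp3@Vb7&Kd2#Wm5^Tr8*Yc1(Ns4)Hg6_Jf0+Ea",
]

# Guesses that compress to a different length are ruled out.
survivors = consistent_guesses(seen, candidates)
print(survivors)
```

Without the compression step, all three candidates would produce ciphertexts of the same length and none could be eliminated this way.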


Thank you for taking the time to explain. Your explanation is clear and gives me something to think about, but I have a question about:

> But it's quite easy to turn a correct encryption algorithm into an incorrect one by bolting on something seemingly innocuous.

Isn't every poorly designed web app essentially a giant "bolt on" to the encryption algorithm (HTTPS, etc) it is served through?

If there's a get_thread API, which zips the comment thread, includes it in some JSON as base64 along with other metadata (only the thread itself is zipped), and then sends that as the response over HTTPS, is that not secure? Nobody would bat an eye at this scenario, but it's essentially the same as your example because the plaintext is compressed before encrypting and sending. If it's okay to do this for a web app, why is it not okay to do it as part of a home-made RSA implementation?

(Of course, I'm not actually arguing for a second layer of encryption because it is unnecessary. But my understanding is that it wouldn't cause any harm and I'm trying to understand if that's correct or not.)

The example you give is similar to but not quite the same as "lzAES". The distinction is that in your example, the application is deciding whether to compress or not. The input/output behavior of the cipher doesn't include the compression step, so the cipher itself doesn't suffer from the problem I mentioned in my first note.

But it's still possible for an application to use a cipher incorrectly. In particular, an application-level decision about whether to compress some data before encrypting can have an effect on the application's security. In the case you mention it seems unlikely to be a problem (but that's an application-level question, so it could be).

As an example where it seems like the application-level decision to compress or not matters a lot, imagine an application that sends an encrypted password to a server. If the application compresses the password first, an attacker could learn which values are not my password via guess-and-check-length. (Of course, even without compression the attacker can learn something about the length of my password just by looking at the length of the ciphertext, so this is probably a case where the application should first pad the message to some fixed length before encrypting. But in any case it almost certainly shouldn't compress the password!)
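One way the fixed-length padding could look, as a rough sketch: the 64-byte size, the 1-byte length prefix, and the names `pad_fixed` and `unpad_fixed` are all assumptions made for illustration, not a standard scheme. The padded bytes are what the application would hand to the cipher, with no compression step.

```python
FIXED_LEN = 64  # hypothetical application-chosen size, in bytes


def pad_fixed(secret: bytes, size: int = FIXED_LEN) -> bytes:
    """Prefix a 1-byte length, then zero-fill to exactly `size` bytes."""
    if size - 1 > 255 or len(secret) > size - 1:
        raise ValueError("secret does not fit the chosen fixed size")
    return bytes([len(secret)]) + secret + bytes(size - 1 - len(secret))


def unpad_fixed(padded: bytes) -> bytes:
    """Recover the original bytes using the length prefix."""
    n = padded[0]
    return padded[1 : 1 + n]


short_pw = pad_fixed(b"hunter2")
long_pw = pad_fixed(b"correct horse battery staple")

# Both padded plaintexts have the same length, so the ciphertext length
# no longer depends on the password length.
print(len(short_pw), len(long_pw))
```

The trade-off is that every message costs the full fixed size, and passwords longer than the limit have to be rejected or handled some other way.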

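You should look into the CRIME and BREACH attacks on TLS/SSL. Both exploit compression applied before encryption. (BEAST is often mentioned alongside them, but it targets predictable CBC initialization vectors rather than compression.)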

The TL;DR is that you generally should not compress secret data before encrypting, especially if part of the request might be reflected in the response.
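The reflection problem can be shown with a small toy model. This is only the general shape of such a compression side channel, not a reproduction of any published attack: the page template, the `SECRET` value, and the `response_length` helper are hypothetical, and the sketch assumes the attacker can see the exact compressed length (as with a length-preserving cipher).

```python
import zlib

SECRET = b"sessionid=7f3a9c"  # hypothetical secret embedded in every response


def response_length(reflected: bytes) -> int:
    """Length visible on the wire when the response body is compressed and
    then encrypted with a length-preserving cipher (an assumption)."""
    body = (
        b"<p>You searched for: " + reflected + b"</p>"
        b"<input value='" + SECRET + b"'>"
    )
    return len(zlib.compress(body))


# The attacker controls `reflected` and only observes response lengths.
right = response_length(b"sessionid=7f3a9c")  # guess equals the secret
wrong = response_length(b"sessionid=Qx8Lm2")  # same length, wrong value

# A correct guess lets the compressor replace the whole secret with a
# back-reference to the reflected text, so the response is shorter.
print(right, wrong)
```

Real attacks refine a guess a few characters at a time and have to cope with noise from Huffman coding and padding, but the signal they exploit is this same length difference.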

If you look carefully at your browser's HTTPS traffic, you may notice that some sites skip HTTP compression for dynamic responses that can contain secrets, while static data (basically guaranteed not to contain anything secret) still uses it.