Hacker News new | ask | show | jobs
by tmpmov 2994 days ago
I skimmed the paper. Find my hand-wavey explanation below. I think I understand the overview, though if you see a mistake/misconception/error let me know.

Suppose Alice and Bob want to "secretly" communicate a message, we'll call the message LL for love-letter, but the government mandates an encryption scheme, we'll call it GE for government encryption. This encryption scheme allows the government to decrypt your correspondence as they know the keys that are necessary for its use.

Suppose also that GE is "secure": only Alice, Bob, and the government can decrypt an encrypted message. We consider this secure as the government already has the keys and hasn't fundamentally broken the algorithm.

The paper proposes that Alice and Bob can still communicate their LL and that the government won't understand it, even over a link that the government can decrypt. Further, the conversation between Alice and Bob will be encoded in such a way that it does not appear to the government that encryption on top of the mandated GE is taking place. In addition, even if the government knew that the conversation had subliminal meaning it wouldn't be capable of decrypting it/understanding it: the LL was encrypted using a method as hard as the GE. Thus the paper gives a good security guarantee for the subliminal messages.

I thought the important point they make in the paper is that the proposed scheme should not generate messages that are clearly encrypted: the government should see a normal conversation between Alice and Bob unrelated to the LL. They reference steganography as an example of sending encrypted information that doesn't appear to be encrypted (e.g. via a message encoding in the color bits of a picture such that the picture looks the same before and after encoding your message into it). We'll call this the normality constraint (NC). They then give an impossibility result for something known as local decoding. Metaphorically, I think local decoding translates to using the same picture to carry a conversation stegenographically. So using your favorite meme picture repeatedly for the stegenographic conversation isn't secure (again, my understanding is shallow here and the constraints may be even stricter). It would appear though that you could randomly select from a meme archive to have your private stegongraphic-like conversation (treating the pictures as strings of bits and randomly selecting among them for the purposes of the proposed algorithm, I explain more below)

The NC motivates them to use a probabilistic function over a set of strings as the alphabet/symbols of their encryption.

I think this means that the government would see a conversation that is syntactically correct and not outrageously meaningless, though I didn't see this quantified/addressed directly in the paper (correct me if I'm wrong).

Thus, as I didn't see the semantics of the encrypted subliminal conversation addressed I felt motivated to attempt a quick answer myself (again my shallow understanding of the paper may have overlooked how they addressed this).

Specifically: Wouldn't it become obvious if the government saw a syntactically wrong and semantically meaningless conversation?

My hair-brained scheme to address the question goes like this:

Use phrase indexes or phrasal pairs. A phrase index would be a string like "hey, how are you?". A phrasal pair might be pairs of related phrasal indexes. I could see this being mechanically generated via machine learning. Although this solution has its own pitfalls as you can probably guess and would need further fleshing out.

The semantic authenticity required by the NC seems to be hard to satisfy based on my reading. Would any care to enlighten me?

TLDR: You can formally encrypt conversations using strings (or language phrases) such that even if they're communicated over a "compromised" channel, your encrypted conversation retains its confidentiality assuming that the compromised channel hasn't been fundamentally broken (e.g. the snooping party has the keys to the channel, not a universal lock pick that breaks the algorithm).

1 comments

If I understood correctly, as long as Alice and Bob can arrange some meeting and agree to some method, they'll be able to communicate over any channel, be it encrypted or not.

Example: Alice and Bob agree that the 5th letter of the n-th phrase from a conversation will mean '1' if it is the set ones={A, C , E} and '0'={B, D, F}. And that the set will be generated based on some characteristic from some headline from a newspaper.

That will be secure even if used over the GE-encryption method from the government, and won't attract attention, except if someone knows how to deal with it.

Isn't that? So, a kind of steganography: hide information in plain sight, as long as it was agreed before and was not leaked.

The paper targets an adversarially selected encryption scheme. Thus perhaps any would work, though I sort of doubt this based on my reading. In addition, the paper supposes that the two parties should be able to communicate using a normal-ish public/private key method.

Based on the quotes below I belive paper relies explicitly on the GE channel's hardness to break:

"On Our Modeling Assumptions. Our model considers a relatively powerful adversary that, for example, has the ability to choose the encryption scheme using which all parties must communicate, and to decrypt all such communications. We believe that this can be very realistic in certain scenarios, but it is also important to note the limitations that our model places on the adversary.

The most obvious limitation is that the encryption scheme chosen by the adversary must be semantically secure (against third parties that do not have the ability to decrypt)."

Later:

"All known constructions of such undetectable random string embedding rely on the sampling of a public random seed after the adversarial strategy is fixed. In this paper, however, we are interested in bootstrapping hidden communications from the very ground up, and we are not willing to assume that the parties start from a state where such a seed is already present."

" We begin with the following simple idea: for each consecutive pair of ciphertexts c and c0, a single hidden (random) bit b is defined by b = f(c, c0) where f is some two-source extractor. It is initially unclear why this should work because (1) c and c0 are encryptions of messages m and m0 which are potentially dependent, and two-source extractors are not guaranteed to work without independence; ..."

"We overcome difficulty (1) by relying on the semantic security of the ciphertexts of the adversarially chosen encryption scheme. Paradoxically, even though the adversary knows the decryption key, we exploit the fact that semantic security still holds against the extractor, which does not have the decryption key."

This is correct: if both parties have some pre-agreed secret, then it is much easier to secretly communicate over any channel and the method you are describing works and is similar in spirit to standard steganographic methods.

However, deriving the "secret" from "some headline from a newspaper" will not work because the adversary can also perform the same derivation and will be able to extract the secret conversation as easily as Alice or Bob.