by colanderman 1084 days ago
Ex1 just excepts low-entropy packets (for high-entropy data, the proportion of 1 bits tends toward one half). Encrypted data presents as high-entropy. This is a crude method (it errs on the side of not excepting) but is very cheap for embedded hardware to compute.
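
A minimal sketch of that kind of bit-counting test, to make the idea concrete. The function names and the `LOW`/`HIGH` cutoffs are my own placeholders for illustration, not something stated above:

```python
def ones_per_byte(payload: bytes) -> float:
    """Average number of 1 bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    return sum(bin(b).count("1") for b in payload) / len(payload)


# Hypothetical cutoffs, chosen only for illustration.
# Random (e.g. encrypted) data should average close to 4.0.
LOW, HIGH = 3.4, 4.6


def looks_low_entropy(payload: bytes) -> bool:
    """True if the 1-bit density is far enough from one half to be excepted."""
    avg = ones_per_byte(payload)
    return avg <= LOW or avg >= HIGH
```

This needs only a popcount and a running sum per packet, which is why it is plausible on cheap hardware. It is a one-sided test: balanced bits don't prove a packet is encrypted, but strongly unbalanced bits suggest it isn't.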

Ex2-4 just except ASCII text, which many unencrypted protocols (e.g. IMAP) use, but which is high enough in entropy that it will often fail the first test.
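
For illustration, here are three plausible ways to test for "mostly ASCII text". I'm assuming these shapes (printable prefix, printable fraction, longest printable run); the helper names and the default prefix length are hypothetical:

```python
def is_printable(b: int) -> bool:
    """Printable ASCII range: space (0x20) through tilde (0x7E)."""
    return 0x20 <= b <= 0x7E


def printable_prefix(payload: bytes, n: int = 6) -> bool:
    """True if the first n bytes exist and are all printable ASCII."""
    return len(payload) >= n and all(is_printable(b) for b in payload[:n])


def printable_fraction(payload: bytes) -> float:
    """Fraction of bytes that are printable ASCII."""
    if not payload:
        return 0.0
    return sum(is_printable(b) for b in payload) / len(payload)


def longest_printable_run(payload: bytes) -> int:
    """Length of the longest contiguous run of printable ASCII bytes."""
    best = run = 0
    for b in payload:
        run = run + 1 if is_printable(b) else 0
        best = max(best, run)
    return best
```

An IMAP-style line such as `b"a001 LOGIN user pass\r\n"` passes the prefix check and has a high printable fraction, while random bytes would usually fail all three.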

Ex5 is necessary because TLS is high-entropy (by nature of being encrypted) and so wouldn't be excepted by the earlier rules. HTTP is presumably also excepted so that compressed uploads (e.g. images/video) aren't flagged.

That "low entropy" is the key to bypassing the GFW isn't surprising at all -- high entropy is all but a necessary feature of most cryptographic schemes. (I say "all but" because encryption doesn't add information, so unless you compress before you encrypt, a (hypothetical) encryption scheme could preserve the plaintext's entropy, according to several objective metrics. I don't know of any that do this, besides the meta-scheme of compressing before encrypting and then steganographically padding the encrypted data afterward. This of course leaks some information through the encryption -- equal to the negentropy of the message -- but it would typically be information that can't be gleaned from context, e.g. that the message is HTML+text.)

So... base64-encode your TLS?
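
A quick sketch of what that would do to the bytes on the wire, using random data as a stand-in for ciphertext (the `printable_share` helper is a hypothetical name of mine):

```python
import base64
import os


def printable_share(payload: bytes) -> float:
    """Fraction of bytes in the printable ASCII range 0x20..0x7E."""
    if not payload:
        return 0.0
    return sum(0x20 <= b <= 0x7E for b in payload) / len(payload)


# Random bytes stand in for an encrypted payload.
ciphertext = os.urandom(3000)
encoded = base64.b64encode(ciphertext)

# Standard base64 emits only A-Z, a-z, 0-9, '+', '/', and '=' padding,
# so every output byte is printable ASCII.
print(f"raw printable share:    {printable_share(ciphertext):.2f}")
print(f"base64 printable share: {printable_share(encoded):.2f}")
print(f"size: {len(ciphertext)} -> {len(encoded)} bytes")
```

The encoded stream is entirely printable ASCII, at the cost of growing by a third (every 3 input bytes become 4 output bytes). Whether that is enough to get past a real filter is a separate question.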