Hacker News
by ciderpunx 4724 days ago
Yes, I think you might be correct, and I do note prominently that my approach is not to be trusted ;-)

I am interested to know what attacks would be possible.

My guess is that you'd look for known sentences -- which you can mitigate by using a custom corpus.

Or you do some sort of statistical analysis on the length of sentences, which is mitigated by making the sentence lengths follow a standard distribution (one typical of ordinary text).
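
A minimal sketch of that sentence-length mitigation, assuming a clipped normal distribution is a good enough stand-in for ordinary prose. The function name and the mean, standard deviation and minimum length are illustrative guesses, not values measured from any corpus.

```python
import random

def sample_sentence_lengths(n, mean=18.0, stddev=7.0, min_len=3, seed=None):
    """Draw n sentence lengths (in words) from a clipped normal distribution.

    mean, stddev and min_len are assumed values for illustration only.
    """
    rng = random.Random(seed)
    lengths = []
    for _ in range(n):
        # Round to a whole number of words and never go below min_len.
        lengths.append(max(min_len, round(rng.gauss(mean, stddev))))
    return lengths

lengths = sample_sentence_lengths(1000, seed=1)
print(sum(lengths) / len(lengths))  # average length, near the chosen mean
```

The generator would then pick or build a cover sentence of each sampled length. In practice you would probably fit the parameters to the same corpus the cover text comes from.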

Or you do a statistical analysis of the word lengths themselves. But if the data you are hiding is GPGed, then it is not obviously vulnerable to this type of analysis, because the character distribution of the encrypted data ought to be even(ish).
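
One rough way to check the "even(ish)" claim is to measure entropy per byte. This sketch uses `os.urandom` as a stand-in for binary ciphertext, which is an assumption: it does not call GPG, and ASCII-armored output would score lower because it is base64 text.

```python
import math
import os
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 means a perfectly even spread)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

plain = b"the quick brown fox jumps over the lazy dog " * 200
random_like = os.urandom(len(plain))  # assumed stand-in for binary ciphertext

print(byte_entropy(plain))        # well below 8 for English text
print(byte_entropy(random_like))  # close to 8 for random-looking bytes
```

This only says something about the hidden payload. Whether the word lengths of the generated cover text look natural depends on how the bits are mapped to words.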

I suppose you would mitigate attacks on the length of the messages by splitting your message and sending the parts from multiple accounts.
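
A sketch of the splitting step, under the assumption that the payload is a byte string and the parts are reassembled in order at the other end. `split_message` is a hypothetical helper, and how parts get assigned to accounts is left out.

```python
def split_message(data: bytes, parts: int):
    """Split data into at most `parts` chunks of roughly equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if not data:
        return []
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

chunks = split_message(b"attack at dawn, bring cider", 3)
# Joining the chunks in order gives back the original payload.
assert b"".join(chunks) == b"attack at dawn, bring cider"
```

Equal-sized chunks are themselves a pattern, so varying the chunk sizes may be worth considering.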

Are there other attacks that I've missed? I'd love to know.

And I'll check out Kahn's book, thanks for the suggestion.