by garrettr_ 3308 days ago
A quick look at the source code shows the generate_key() function [0] to be insecure. It generates 32 random bytes (good, that's what you need for an AES-256 key), but then it uses those random bytes to sample from a distribution which only has 62 characters. This significantly reduces the security of the key, from 256 bits of entropy to ~190 bits (log2(62^32)). And that would be in the best case, if it were sampling uniformly from the distribution - it is not.
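
For anyone who wants to check the best-case figure, here is a quick sketch of the arithmetic. It only assumes what is stated above: 32 characters, each drawn from 62 symbols, and (optimistically) drawn uniformly and independently.

```python
import math

ALPHABET_SIZE = 62   # number of distinct characters a key can contain
KEY_LENGTH = 32      # characters per generated key

# Best case: every character is uniform and independent of the others.
bits_per_char = math.log2(ALPHABET_SIZE)
key_entropy_bits = KEY_LENGTH * bits_per_char

print(f"{bits_per_char:.3f} bits per character")
print(f"{key_entropy_bits:.1f} bits per key, versus 256 for 32 raw bytes")
```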

I recommend reading Section 9.7 of Cryptography Engineering [1] to understand why choosing random elements from a set is harder than it seems. A good example of a similar bug is the nasty bug in Cryptocat's PRNG from 2013 [2].

I assume this step was done so the AES key could be included in the URL fragment, since a set of random bytes may not be URL-safe. I recommend feeding the random bytes of the key directly into the underlying cryptographic functions, and using a URL-safe encoding at a higher level when necessary.
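
A minimal sketch of that separation, in Python rather than the project's JavaScript. The helper name `key_from_fragment` is hypothetical, and URL-safe base64 is just one reasonable choice of encoding:

```python
import base64
import secrets

# 32 raw bytes from the OS CSPRNG: this is the full 256-bit AES key.
key = secrets.token_bytes(32)

# Encode only for transport in the URL fragment; the cipher still receives `key`.
fragment = base64.urlsafe_b64encode(key).rstrip(b"=").decode("ascii")

def key_from_fragment(text):
    """Restore the padding stripped above and decode back to raw bytes."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

assert key_from_fragment(fragment) == key
print(len(fragment))  # 43 characters for a 32-byte key
```

The fragment grows from 32 to 43 characters, but the key keeps all 256 bits.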

Also, it appears you are using AES [3], a block cipher, but I cannot figure out what block cipher mode you are using. I'll have to dig into the CryptoJS code a little more to see what it defaults to, but I have a sinking feeling that it's ECB, which is completely insecure. Dan Boneh's Crypto I course on Coursera is a good way to learn the basics of block cipher modes.

[0]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...
[1]: https://www.amazon.com/Cryptography-Engineering-Principles-P...
[2]: https://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pse...
[3]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...


I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post[0], and would in fact be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but am interested in being proven wrong.

It's using CBC mode.

[0] http://incoherency.co.uk/blog/stories/hardbin.html

EDIT:

> And that would be in the best case, if it were sampling uniformly from the distribution - it is not.

Can you please point out how it's not? It's intended to sample uniformly. It would be non-uniform if it were "randombytes[i] % alphabet.length".

EDIT2:

I see now how it's non-uniform. 256 values in randombytes doesn't map 1:1 onto 62 values in alphabet. I will fix this tonight, thanks for pointing it out.
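
The pigeonhole argument is easy to check: 256 = 4 * 62 + 8, so any fixed map from one byte to one of 62 characters must favour some characters. The sketch below is an illustration in Python, not the hardbin code. The scaling map `b * 62 // 256` and the alphabet ordering are assumptions, and rejection sampling is shown as one common fix.

```python
import secrets
import string
from collections import Counter

# Assumed 62-character alphabet; the real one may be ordered differently.
ALPHABET = string.ascii_letters + string.digits

# Hypothetical scaling map: byte value b -> index floor(b * 62 / 256).
counts = Counter(b * len(ALPHABET) // 256 for b in range(256))
# Tally how many characters receive 4 byte values and how many receive 5.
histogram = sorted(Counter(counts.values()).items())
print(histogram)

def uniform_key(length=32):
    """Rejection sampling: discard bytes >= 248 so the rest split evenly."""
    limit = 256 - (256 % len(ALPHABET))  # 248 = 4 * 62
    out = []
    while len(out) < length:
        b = secrets.token_bytes(1)[0]
        if b < limit:
            out.append(ALPHABET[b % len(ALPHABET)])
    return "".join(out)

print(uniform_key())
```

With the scaling map, 8 characters are each hit by 5 byte values and the other 54 by 4, so the favoured characters are 25% more likely. Python's `secrets.choice(ALPHABET)` would also give a uniform pick without hand-written rejection logic.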

> I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post[0], and would in fact be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but am interested in being proven wrong.

I understand that you're trying to balance the tradeoff between security and usability here, which is tricky. If quantum computers are part of your threat model, remember that Grover's algorithm provides a quadratic speedup for brute-forcing a symmetric key, so 2^190 would become 2^95 against a quantum adversary. Personally I prefer the margin of safety provided by using a full-strength 256-bit key :)
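
The halving is simple arithmetic. Grover's algorithm searches N keys in roughly sqrt(N) evaluations, so the exponent is divided by two. A rough sketch that ignores constant factors and the practical cost of running the quantum circuit:

```python
import math

classical_bits = 32 * math.log2(62)  # best case for a 32-character, 62-symbol key
grover_bits = classical_bits / 2     # sqrt(2^n) = 2^(n/2)

print(f"62-symbol key: 2^{classical_bits:.1f} classical, 2^{grover_bits:.1f} with Grover")
print(f"raw 256-bit key: 2^256 classical, 2^{256 // 2} with Grover")
```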

> It's using CBC mode.

Phew! That would've been truly catastrophic.

CBC mode isn't exactly a saving grace here, since it's unauthenticated.

The code and data are shipped together out of IPFS. If you don't trust the data, you don't trust the code anyway, so it makes no difference whether the data is authenticated.

First, read this: https://tonyarcieri.com/all-the-crypto-code-youve-ever-writt...

Second, what is the threat model where you trust IPFS but still need to encrypt client-side? Unauthenticated CBC mode totally defeats the point of encryption, but encryption totally defeats the point of trusting IPFS.

Why not just-- crazy idea!-- use authenticated encryption even if you trust IPFS?
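
For readers unfamiliar with the term: one classic way to get authenticated encryption on top of CBC is encrypt-then-MAC, where a MAC is computed over the IV and ciphertext and checked before anything is decrypted. Below is a minimal sketch of only the tagging and verification step, using Python's standard library. The function names and blob layout are hypothetical, the ciphertext is a random stand-in instead of real AES-CBC output, and none of this is taken from hardbin. An AEAD mode such as AES-GCM provides the same guarantee in a single primitive.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def seal(mac_key, iv, ciphertext):
    """Append an HMAC-SHA256 tag computed over IV || ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag

def open_sealed(mac_key, blob, iv_len=16):
    """Verify the tag in constant time before anything is decrypted."""
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified")
    return body[:iv_len], body[iv_len:]  # (iv, ciphertext) for the CBC decryptor

mac_key = secrets.token_bytes(32)     # must be independent of the AES key
iv = secrets.token_bytes(16)
ciphertext = secrets.token_bytes(48)  # stand-in for real AES-CBC output

blob = seal(mac_key, iv, ciphertext)
assert open_sealed(mac_key, blob) == (iv, ciphertext)
```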

I don't think you understand IPFS.

If you trust your IPFS node, you know that you're retrieving the correct content. You still don't want others to be able to read it.

EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.

Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.

You mean it would have been truly catastrophic if it were ECB (electronic codebook) mode?

> to understand why choosing random elements from a set is harder than it seems

There is no sample-from-a-set problem involved here. Just convert base 256 to base 62; there is a correct way to do this.
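
A sketch of what such a conversion could look like, in Python: treat the 32 bytes as one big integer and write it out in base 62. The alphabet ordering and the function names are assumptions, not anything from hardbin.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # assumed 62-symbol alphabet

def b62encode(data):
    """Treat the bytes as one big-endian integer and write it in base 62."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits)) or ALPHABET[0]

def b62decode(text, length=32):
    """Inverse of b62encode for a known byte length."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n.to_bytes(length, "big")

key = secrets.token_bytes(32)
encoded = b62encode(key)
assert b62decode(encoded) == key
print(len(encoded))  # at most 43 characters for 32 bytes
```

This is lossless, so the key keeps all 256 bits, at the cost of up to 43 characters in the URL instead of 32.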