by pclmulqdq 497 days ago
They didn't want a KDF, as far as I know, but they wanted a hash function with unlimited input size.

Including the username in the hash input gives you guaranteed domain separation between users that you don't get from salts/nonces. It's generally a good idea if you have a hash function with unlimited input size (all modern cryptographic hash functions except bcrypt have unlimited input size).


> but they wanted a hash function with unlimited input size

I'm kind of baffled how they came to use bcrypt for this. Bcrypt is not exactly subtle about only supporting 72 bytes of input. And this is at a company that provides auth as a service; I've got to imagine they had multiple engineers who knew this (I guess not working on that code). Hell, I know this and I've only used bcrypt twice and I'm nowhere near a security/crypto guy.

BCrypt should loudly fail if more than 72 bytes are sent to its input.
Maybe it should. Discarding the rest of the bytes works fine for passwords, though. I guess that's just not sufficient.
In my book, discarding entropy is a generally dumb thing to do. Passwords are usually under 72 chars, but a lot of people use concatenations of usernames and passwords in their hash to get guaranteed domain separation between users.
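
To make the truncation problem concrete, here is a minimal sketch. It does not call bcrypt; `truncating_hash` and `naive_key` are hypothetical names, and SHA-256 stands in only so the example runs with the standard library. The only behaviour borrowed from bcrypt is the 72-byte input limit mentioned above.

```python
import hashlib

BCRYPT_MAX_INPUT = 72  # bcrypt only uses the first 72 bytes of its input


def truncating_hash(data: bytes) -> bytes:
    """Stand-in for a hash that silently drops input past 72 bytes.

    This is NOT bcrypt. SHA-256 is used only so the sketch is runnable;
    the silent truncation is the part being illustrated.
    """
    return hashlib.sha256(data[:BCRYPT_MAX_INPUT]).digest()


def naive_key(username: str, password: str) -> bytes:
    # Hypothetical username || password layout, as described in the thread.
    return truncating_hash(username.encode() + password.encode())


long_username = "u" * 72  # fills the whole 72-byte window by itself
same = naive_key(long_username, "correct horse") == naive_key(long_username, "wrong")
print(same)  # True: the password never reaches the hash
```

With a short username the two passwords give different outputs, which is why this kind of bug can go unnoticed until a long enough prefix shows up.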
They clearly wanted something stronger than "a hash function" or they'd have reached for weaker cryptographic hashes.
They wanted a hard-to-compute cryptographic hash function. Today, that means bcrypt or something with a KDF construction. However, they needed one with unlimited input size, which rules out bcrypt.
Or just a hash of the bcrypt hash, for the password!

I don't like using thought-stopping cliches any more than anybody else does, but this design feels a little cargo-culted. All this stuff follows the more fundamental question of "why is the password mixed into a cache key"?

Yeah, I think both of the following would have worked if they wanted the password involved in a cache key and they wanted bcrypt to be used:

* bcrypt(SHA-512(PW || stuff))

* SHA(stuff || bcrypt(PW))
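
A rough sketch of the pre-hash step in the first construction, using only the standard library. The function name `prehash` is made up, and the base64 step is an assumption on my part: some bcrypt implementations are reported to treat the input as a C string and stop at the first NUL byte, so a raw digest is commonly encoded before being handed to bcrypt. The bcrypt call itself is left out.

```python
import base64
import hashlib


def prehash(password: bytes, stuff: bytes) -> bytes:
    """SHA-512(PW || stuff), base64-encoded so it contains no NUL bytes."""
    digest = hashlib.sha512(password + stuff).digest()
    return base64.b64encode(digest)


raw_len = hashlib.sha512(b"pw").digest_size  # raw digest length in bytes
encoded = prehash(b"pw", b"stuff")
print(raw_len, len(encoded))  # 64 88
```

Note the sizes: the raw SHA-512 digest is 64 bytes and fits under the 72-byte limit, but its base64 form is 88 bytes, so bcrypt would again drop the tail. The 72 characters that survive still carry 432 bits of the digest, but it is the kind of detail that is easy to miss.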

Disclaimer: Not cryptography advice.

It's still unclear to me why the password is in there.

> It's still unclear to me why the password is in there.

Perhaps they did not want to apply cache invalidation purely by the passage of time, or want that passage of time to be long, but wanted to treat a credentials update as a cache invalidating event. A safer way to implement that would perhaps be to have a concept of a version of an account, incremented when authentication options or other significant properties change, and including that in the cache key.
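
A minimal sketch of that account-version idea, assuming a hypothetical `cache_key` helper and an integer version counter stored with the account. Nothing here reflects how any real service builds its keys; the point is only that no password-derived material is needed.

```python
import hashlib


def cache_key(user_id: str, account_version: int) -> str:
    """Hypothetical cache key built from non-secret fields only."""
    uid = user_id.encode()
    # Length-prefix the user id so distinct (id, version) pairs cannot
    # produce the same byte string.
    material = (
        len(uid).to_bytes(4, "big")
        + uid
        + account_version.to_bytes(8, "big")
    )
    return "auth-cache:" + hashlib.sha256(material).hexdigest()


before = cache_key("alice", 1)
after = cache_key("alice", 2)  # version bumped on a credentials change
print(before != after)  # True
```

Bumping the version on a password change makes old entries unreachable under the new key; they still sit in the cache until they expire.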

I'm not sure why it would matter though: even if a credentials change does invalidate the cache from the PoV of the user looking up information, the information is potentially still in the cache so could still be referred to by someone else who has gained knowledge of the old credentials.

Perhaps the password is used as part of the cache key so that a password update implicitly invalidates the cache?
hmac-bcrypt solves that problem very well, and should replace plain bcrypt: https://github.com/epixoip/hmac-bcrypt
I don't think it's a good idea for people to adopt new bcrypt constructions so that they can use it to generate cache keys (or, worse, other keys).

(I need that "man standing up in the town hall meeting" meme for this.)

Just use a real KDF, if that's really what you want. I'm still confused what password-derived material is doing in a Redis key.

HMAC-bcrypt is a more complicated version of the first construction I proposed, and it would need a rigorous cryptanalysis if someone wanted to actually use it in production. It sounds like Okta actually wanted PBKDF2(stuff) here.
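
For reference, a hedged sketch of what PBKDF2(stuff) could look like with Python's standard library. `derive_key` is a hypothetical wrapper, and the iteration count and 32-byte output length are illustrative assumptions, not a recommendation.

```python
import hashlib
import os


def derive_key(stuff: bytes, salt: bytes, iterations: int = 600_000) -> bytes:
    """PBKDF2-HMAC-SHA256 over the whole input; no 72-byte cutoff."""
    # iterations and dklen are assumed values chosen for illustration.
    return hashlib.pbkdf2_hmac("sha256", stuff, salt, iterations, dklen=32)


salt = os.urandom(16)
prefix = b"x" * 100  # already longer than bcrypt's 72-byte window
k1 = derive_key(prefix + b"password-one", salt)
k2 = derive_key(prefix + b"password-two", salt)
print(k1 != k2)  # True: bytes past position 72 still affect the output
```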

An authentication company should have known this...

For 'unlimited' input size it should be SHA-3-512. Maybe too slow, but bcrypt is slower, right? Fewer things to go wrong too.
All of the SHA functions allow unlimited input size. And yes, bcrypt computation time dwarfs that of SHA-3.

The SHA-3 family has "extendable-output functions," which can ostensibly be used to generate unlimited numbers of bits (albeit with only a given security level). These are new to SHA-3.
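
A small illustration of an extendable-output function using SHAKE256 from Python's `hashlib`. The helper name `expand` is made up for this sketch.

```python
import hashlib


def expand(data: bytes, n_bytes: int) -> bytes:
    """Return n_bytes of SHAKE256 output for data."""
    return hashlib.shake_256(data).digest(n_bytes)


short = expand(b"some input", 32)
long = expand(b"some input", 64)
print(len(short), len(long))  # 32 64
print(long[:32] == short)     # True: a longer request extends the same stream
```

Asking for more bytes does not raise the security level; it only produces more output from the same underlying state.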

bcrypt-pbkdf (used in OpenSSH) exists for that purpose.
Would scrypt solve this problem? Or does it have the same limitation as bcrypt? Or does scrypt depend on the hardware?