They wanted a hard-to-compute cryptographic hash function. Today, that means bcrypt or something with a KDF construction. However, they needed one with unlimited input size, which rules out bcrypt (most implementations only use the first 72 bytes of input).
Or just a hash of the bcrypt hash, for the password!
I don't like using thought-stopping cliches any more than anybody else does, but this design feels a little cargo-culted. All this stuff follows the more fundamental question of "why is the password mixed into a cache key"?
> It's still unclear to me why the password is in there.
Perhaps they did not want to invalidate the cache purely by the passage of time, or wanted that time to be long, but did want to treat a credentials update as a cache-invalidating event. A safer way to implement that would perhaps be to give each account a version number, incremented when authentication options or other significant properties change, and to include that version in the cache key.
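To make the version idea concrete, here is a minimal sketch. The function name and the fields (`user_id`, `username`, `account_version`) are hypothetical and not anything Okta is known to use; the point is only that no secret goes into the key.

```python
import hashlib

def versioned_cache_key(user_id: str, username: str, account_version: int) -> str:
    """Build a cache key from non-secret account fields plus a version counter.

    All names here are illustrative assumptions, not a real API.
    """
    parts = [user_id.encode(), username.encode(), str(account_version).encode()]
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    message = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    # SHA-256 accepts input of any practical length, so nothing is truncated.
    return hashlib.sha256(message).hexdigest()
```

Bumping `account_version` on a password change yields a new key, so the old entry is simply never looked up again.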
I'm not sure why it would matter though: even if a credentials change does invalidate the cache from the PoV of the user looking up information, the information is potentially still in the cache so could still be referred to by someone else who has gained knowledge of the old credentials.
Then use some other data which can act as a proxy for that, like the date of the last credential change. Using the password itself is a terrible security smell.
HMAC-bcrypt is a more complicated version of the first construction I proposed, and it would need a rigorous cryptanalysis if someone wanted to actually use it in production. It sounds like Okta actually wanted PBKDF2(stuff) here.
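For what "PBKDF2(stuff)" could look like, here is a rough sketch using Python's standard `hashlib.pbkdf2_hmac`. The helper name, the field layout, and the iteration count are my assumptions, not a description of what Okta runs.

```python
import hashlib

def slow_key(fields: list[bytes], salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a deliberately expensive key over all fields, with no length cap.

    The helper and its parameters are hypothetical; tune iterations to your hardware.
    """
    # Length-prefix each field so field boundaries are unambiguous.
    message = b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    # PBKDF2-HMAC-SHA256 takes every input byte into account, unlike bcrypt.
    return hashlib.pbkdf2_hmac("sha256", message, salt, iterations)
```

With this shape, a long username no longer pushes the rest of the input out of the computation, which is the failure mode a 72-byte cap invites.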
An authentication company should have known this...
I feel like the "authentication company should have known" thing is unuseful; most developers at "security" companies are just ordinary generalist developers. Ironically, I think they boned themselves by trying to be too clever here, not too casual.
You don't think a company whose entire reason for being is providing security services for other companies should have designs related to authentication reviewed by security experts?
All of the SHA functions allow effectively unlimited input size (SHA-2 caps messages at 2^64 or 2^128 bits, which is irrelevant in practice). And yes, bcrypt computation time dwarfs that of SHA-3.
The SHA-3 family has "extendable-output functions," which can ostensibly be used to generate unlimited numbers of bits (albeit with only a given security level). These are new to SHA-3.
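As a small illustration of an extendable-output function, Python's `hashlib` exposes SHAKE256, where the caller picks the output length. The helper name below is made up for the example.

```python
import hashlib

def shake_output(data: bytes, n_bytes: int) -> bytes:
    """Return n_bytes of SHAKE256 output for data (illustrative helper)."""
    # Unlike fixed-size digests, the caller chooses the length here.
    return hashlib.shake_256(data).digest(n_bytes)

short = shake_output(b"example input", 16)
long_ = shake_output(b"example input", 64)
# A longer request extends the same output stream, so the short one is a prefix.
assert long_[:16] == short
```

Asking for more bits does not raise the security level beyond what the underlying function provides, which is the caveat in the comment above.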
SHA-3 has more internal state, so it plausibly is better at handling very large data. If 'unlimited' is really less than a gigabyte, there's no problem. At that point what matters is mostly preimage-style attacks and length extension, and SHA-3 is better on those. SHA-512 has zero resistance to length extension attacks.
Internal state length may be a bit of a red herring (note that SHA-3 makes up for that longer internal state by ingesting more data per round), but SHA-3 probably has a higher security margin than the SHA-2 construction, mostly because we have had sponge constructions for less time than we have had Merkle-Damgård constructions. NIST basically forced a higher security margin on SHA-3. You are correct about the length extension attacks (although these are mitigated by using a truncated variant such as SHA-512/256), but I don't think that matters here.