So if I'm using SHA-1 already to store passwords, what are my options for moving to a different system? I assume there's no way to rehash the passwords?
1. The next time a user logs in to your system and you verify against the SHA-1 hash that they are who they say they are, recompute the correct hash for bcrypt. Then, delete the SHA-1 hash. It does you no good to have a bcrypt version if you keep the SHA-1 version around.
2. Generate the bcrypt hash from the SHA-1 hash. That is, pretend that the SHA-1 hash is the user's password. This isn't as clean (your password authentication software will then have to do SHA-1 followed by bcrypt) but it means you'll be able to migrate your entire database all at once if you so choose. This also causes a very (very, very) slightly higher chance of password collisions, although there's not much to worry about from that.
Or you can do both, migrate everything to bcrypted SHA for now and then replace these with straight up bcrypt the next time the user logs in.
You ostensibly have a hash method identifier per hash, so just create "sha+bc" and "bc" along with your current "sha".
Also, why is the risk of a collision greater? It seems to me that, if anything, it should be lower, as SHA hashes always consist of a fixed number of bits, and thus aren't as likely to collide when hashed to bcrypt, assuming the length of a bcrypt hash is the same (or larger) number than a SHA one.
Basically, it seems to me that, if they're going to collide, they're more likely to collide at the SHA level, which is a problem either way.
Imagine you have two hash functions F and G, both mapping from the domain of integers to integers mod 2^128. Imagine they are perfect in that if you hash all the integers up to some large N, each hash is expected to recorded exactly the same number of times (probabilistically).
Now clearly if I hash a password F(P) and another password F(Q) there is a 1 in 2^128 chance they collide.
Now imagine we do G(F(P)) and G(F(Q)). We first have the chance of 1 in 2^128 that F(P) == F(Q) which implies G(F(P)) == G(F(Q)). However, that is not all!
We now have a new 1 in 2^128 chance that G(F(P)) == G(F(Q)). So we have (about) a 1 in 2^127 chance that the passwords will collide.
But, none of this really matters. Collisions aren't what you worry about with password hashes.
I see what you mean, and I agree that it doesn't matter, but it's an interesting exercise anyway.
I disagree that there's a 2^128 chance that they will collide. Trivially, I can show you a hash that will never collide for up to some N, and that is F(P) = P mod 2^128. This will never collide unless P is more than 128 bits long.
My rationale, above, was that SHA constrains the space to 128 bits. Therefore, for different SHA hashes (ones that haven't already collided), the probability that bcrypt will collide might be smaller (or zero, as in my example above).
In reality it doesn't work like that, I know, but in theory you can't be sure that the probabilities of collision will add (or, well, multiply) up.
Point taken, it's probably true that the probability that G(F(P)) == G(F(Q)) given F(P) != F(Q) is less than 1 in 2^128. But it's probably also true that it's greater than 0.
Clearly it's impossible to be less than zero. So no matter what you do, Defining H(X) to be G(F(X)) will have strictly more collisions than F(X).
The reason I would argue it's greater than zero is that if a function H existed such that H(X) will never collide for X less than 2^128, it would probably have some cryptographic weakness.
Hmm, you're right indeed, since they're additive. I should have said that the second layer is less likely to collide than the first, not than both combined.
I've had the "joy" of doing password migrations in the past when moving authentication systems. We made our auth chain support both methods and then when someone logged in with an old method it stored the password in the new method and deleted the old record. This was on a system with only a few hundred users though, so YMMV.
Convert as sessions expire and people sign in again. You'll obviously need a db column to keep track of which passwords are converted/unconverted, e.g. password_algo.
1. The next time a user logs in to your system and you verify against the SHA-1 hash that they are who they say they are, recompute the correct hash for bcrypt. Then, delete the SHA-1 hash. It does you no good to have a bcrypt version if you keep the SHA-1 version around.
2. Generate the bcrypt hash from the SHA-1 hash. That is, pretend that the SHA-1 hash is the user's password. This isn't as clean (your password authentication software will then have to do SHA-1 followed by bcrypt) but it means you'll be able to migrate your entire database all at once if you so choose. This also causes a very (very, very) slightly higher chance of password collisions, although there's not much to worry about from that.