Hacker News new | ask | show | jobs
by aj-code 3639 days ago
The password hashing algorithm outlined is really not best practise, only 7 rounds and the use of SHA-256 which is a general cryptographic hash, not a password specific one.

The correct answer to password hashing is still mostly "just use bcrypt", or even better, scrypt or argon2. Failing that PBKDF2 with enough rounds is not terrible, if you have to. All of these algorithms should to be tuned to require as much processing power as you can afford.

I'm also a fan of storing another random value somewhere separate from the salt/password hash, which is incorporated into the hashing scheme. You might store the salt/password hash in the database, and the other random value on the application servers. This means if your database is compromised (i.e. though SQL injection), hashes can't be cracked without also compromising an application server.

1 comments

I've had a lengthy debate and discussion about this with a number of cryptographers. Here's the math we came up with that appears to allow for SHA and other hashes (blog post will be coming eventually). Please correct me if anything is wrong:

----------------

High end GPU hardware can do about 1T (1e12) hashes per second. This is top end stuff that is expensive, but available. Brute force attacks usually look something like this with no knowledge of anything:

a

aa

aaa

aaaa

aaaaa

aaaaaa

etc.

We can probably reduce the set by skipping passwords of length less than 6 or even 8. In fact, we can determine the password rules by simply signing up for the system in question and more often then not they will tell you what the password rules are. It’s likely that passwords are between 8 and 16 characters, can’t be dictionary words, case-sensitive, etc.

We now have to determine if the hacker has our user data via access to the server or via some type of SQL injection hack that gave the data to them via the web. Let’s assume the latter because our servers are using RSA keys, intrusion detection, and a bunch of other crap to prevent access. This means the hacker has no idea how many hash iterations we are doing or if we are doing in memory salt modification. They likely have the salt though, because that’s probably stored in the database.

Therefore, the hacker has no choice but to assume the salt is either prepended or append to the password with no modification and the application is doing 1 or more hash iterations.

In order to try all possible combinations of 8-16 characters with 100 character possibilities (a-zA-Z0-9 + punctuation characters rounded up for simplicity) would require this many hashes:

100^8 + 100^9 + 100^10 + 100^11 + 100^12 + 100^13 + etc.

This rounds out to be roughly 1e32. If the GPU can do 1e12 per second, it would require 1e20 seconds. That’s a horrifically long time.

What this does mean though is that the longer passwords would never be cracked and the shorter ones would certainly be cracked. If you do all the 8 character combinations first, it would require 1e16 hashes. That means the GPU could complete that in 1e4 seconds. i.e. 3 hours or so. If you add in multiple hash iterations, it simply multiplies the hours. You could throw a few hundred GPUs at it and keep the rate to 3 hours, but try 1-300 hash iterations in parallel.

The safety in any process is modifying the algorithm in code. If you modify the salt by even 4 characters, it immediately makes the brute force approach impossible as the combinations grow exponentially based on the length of the salt. You can use two secure UUIDs as the salt and swap 4 characters based on an algorithm like this:

salt[salt[0] % salt.length] = salt[salt[4] % salt.length] salt[salt[1] % salt.length] = salt[salt[3] % salt.length] salt[salt[2] % salt.length] = salt[salt[2] % salt.length] salt[salt[3] % salt.length] = salt[salt[1] % salt.length]

We didn't calculated the exact iteration count, but it appears like the number of hashes required would be on the order of 1e77 plus the base set of hashes for the password portion of the String. We could be way off base, but this seems like it causes brute force attacks on virtually any hash algorithm computationally impossible.

----------------

Based on this, we assumed that SHA hashes are computationally impossible to brute force given some amount of salting and modification. BCrypt is great because it reduces the initial number of hashes per second considerably (from 1e12 to 1e6 or less). However, BCrypt is not required in order to secure a system, but it certainly works nicely.

This is a good overview of how hackers actually crack hashes if anyone is interested. https://www.trustedsec.com/june-2016/introduction-gpu-passwo...

The other things to take into account:

Code complexity, quality, and cost of ownership. Rolling your own scheme when mature and well tested code already exists is generally something to be avoided, especially when crypto is involved.

Force multipliers. Your server is probably using a CPU to calculate hashes, while hackers are using at least GPUs and possibly FPGAs or other hardware, which are hugely faster than your CPU. So yes, while you can use any cryptographic hash with enough rounds to make high quality passwords unlikely to be cracked, using a password specific algorithm correctly will greatly reduce the advantage hackers have. This makes more of your user's passwords unlikely to be cracked.

The only advice any respectable cryptographer will have given is "don't invent your own crypto".

I'd urge you to reconsider having done so.

The problem with "please correct me if anything is wrong" is no one is going to invest significant man-hours reviewing yet another algorithm unless it comes with a detailed paper showing why it's better than the existing standard. Which it isn't, because there is no memory complexity here, and thus, Argon2 is a more interesting target.

> ... all possible combinations of 8-16 characters with 100 character possibilities ...

Yes, but this doesn't even come close to describing the typical users' password, which is most likely a 6-letter English word with a capital letter and a 1! appended to the end. Your calculation here isn't really relevant, because it's all about the worst or common case. (You also assume that people are using a GPU for a compute-bound problem, when much faster FPGAs are also available, but either way it's moot.)

Security through obscurity, which is what you're proposing with the shuffled salt idea, is also not normally considered the right way to go. If you wanted to use a similar but much simpler and straightforward method, you could just encrypt the salted hashes before storing them in the database.

Simple passwords are mitigated by salting and slat modification. The guide covers both of these and hashing. This should be sufficient even against thousands of GPUs.
In short, it's not. A robust scheme should be secure if the attacker knows the algorithm or not. The secret should not be the algorithm, the secret should be a sufficiently long secret value.