Hacker News new | ask | show | jobs
by larssorenson 2118 days ago
Compromising the hash and salt, since they must be stored close together, makes it possible to identify if the salted hash is a password in a corpus of previously compromised passwords. An attacker can do Hash(PW, Salt) for all PW in a list of leaked/cracked plaintext passwords. If they've guessed your password and it's shared across multiple services, lateral compromise. Salting only prevents the rainbow table attacks, where an attacker precomputes all possible hash values for a known keyspace (like, say, 8 alphanumeric length passwords) and just look up for a match. Encryption is concerning because it necessitates the ability to decrypt since they're often inverse operations of each other, and presumably there's a shared key stored somewhere to do the comparison, which means it's likely trivial to recover the password compared to hash cracking and undermines any strength or complexity benefits. This also likely points to other bad behaviors utilizing this "feature", such as helpfully emailing you your plaintext password when you forget it.
2 comments

Rainbow Tables are a specific innovation in time-space tradeoffs (precomputation) rather than the name for all such attacks.

The specific clever trick in Rainbow Tables is the observation that rather than storing

  hash(password) : password
  5f4dcc : password
  c2fe67 : jimmy
  25d55a : 12345678
... we can build a function that takes the output from hash(password) to deterministically create a new candidate password, let's call this function pass(hash), and then chain the hash and our new function together as many times as we want. This lets us store much less data, while doing more work during our look-up phase.

  hash(pass(hash(password))) : password
  153dfc : password
  92fe87 : jimmy
  213eea : 12345678
Now if I find a hash 92fe87 in a password hash file, I do not learn that the password was jimmy, instead I need to compute pass(hash(jimmy)) and that's the password I was looking for. And if I find 39a4e6 which isn't in my list, I calculate hash(pass(39a4e6)) and discover that's 213eea, then I look this up in the table and I discover the password I need was 12345678. Obviously real Rainbow Tables don't just run the hash twice like this, but instead some fixed number of times chosen by the creator to trade off less space versus more work to find a password.
I should actually fix this. What I've described above is basic "chaining", but Rainbow Tables are a further improvement still by Philippe Oechslin. The additional insight in Rainbow Tables is that we can reduce collisions in our hash-pass-hash-pass back and forth if we modify that pass function so that its behaviour varies by depth, this way if a collision occurs but at different depths in different chains (e.g. maybe the chain starting with password "password" hashes immediately to 5f4dcc but in another chain the value 5f4dcc is found for the password "j58X_m04" after six steps) the next call to pass() will diverge again, so the collision only wastes a small fraction of our precomputation effort. If the collision does happen at the same place in the chain, the final hash output will be identical to another chain, so it's easy to discover this problem and apply whichever mitigation seems appropriate.
Interesting, I haven't worked with rainbow tables very much since by the time I got into the world of hash cracking it had either been deprecated by salting or wasn't relevant (i.e. NTLM). That is a clever trick of trading back some of the space for extra time; I remember some of the rainbow table file sizes being ridiculous to the point of almost unusable haha.

edit: spelling

If one uses bcrypt for hashing passwords, as currently the best practice recommends, building basically a salted rainbow table becomes rather expensive, too. Not impossible, since the amortized cost for many common passwords is relatively low, but still sort of expensive.

Ideally a machine that generates and checks the hashes should be a box without a NIC, connected to the rest of the servers via a bunch of RS-232 ports. This would make extracting the salt much harder, down to effectively impossible. Few orgs can afford such a setup, though, due to the hassle of administering it.

> Not impossible, since the amortized cost for many common passwords is relatively low, but still sort of expensive.

This statement seems like it gravely underplays the numbers.

Traditional Unix crypt uses a 12-bit salt. So this means your precomputation (whether a Rainbow Table or not) is 4096 times more expensive. That's just about plausible though already uncomfortable ("Sorry boss, I know you said the budget was $10 but I actually spent forty thousand dollars").

But bcrypt uses a 128-bit salt. So now your precomputation is so much more expensive that if the equivalent ordinary brute force attack on a single password cost 1ยข and took one second on one machine, you'd spend a billion dollars per second, over a billion seconds, on each of a billion machines, and still not even have scratched the surface of the extra work you've incurred to do your precomputation.

Or to put it another way: Impossible in practice.

I see! But the case I'm arguing about is when the salt is known, pilfered along the password hashes during a compromise.
So what are you precomputing?

Rainbow Tables are a precomputation "time-space tradeoff" attack. You do a bunch of preparatory work which is amortized over multiple attacks and results in needing space to store all your pre-computed data. This is nice for two reasons:

1. You get to do all the hard work before your attack, leaving less time between the attack and your successful acquisition of the passwords compared to work that's necessarily done after stealing the credential database.

2. You can re-use this work in other attacks

But if you're waiting until you know the salt you don't get either of these advantages, so Rainbow Tables are irrelevant.

It's like if somebody mentions the F-14 fighter jet in a discussion about the fastest way to get from Times Square to Trump Tower. Yes the F-14 fighter jet is a fast aeroplane, but it can't go to either of those places so it isn't relevant whereas Usain Bolt is a very fast human so he really could run from one to the other.