by pwg 3184 days ago
Significantly better than using MD5 or storing in plaintext (both of which would have been likely in 2012).

And in 2012 the current breaks from this year were not yet known. Some considered sha1 to be in its twilight, but it was not 'broken' yet at that time.

It is in fact not significantly better, for this purpose, than MD5.
That assertion is much easier to make now, with the knowledge we have in 2017, five years later.

But without knowledge of what was coming for sha1 in five years, back in 2012 it would have been a much better choice than either MD5 or plaintext storage.

However, even today, with the knowledge we now have regarding sha1, if one's choices are limited (for some strange reason) to only sha1 or MD5, sha1 is still a better choice than MD5. Yes, sha1 is weak, and it should clearly not be used for any new designs, but sha1 is still stronger than MD5.

Also note, the 2012 date was when they last used sha1, not when they started using it. That fact is somewhat critical to keep in mind. What got leaked were some leftover hashed passwords that never got updated to bcrypt and were still hanging around in their database (probably because those accounts have not logged in during the last five years and so were never forced through a password change).

No. For similar reasons, salted SHA-2 is also not materially better than MD5. You think this is about the strength of the underlying cryptographic hash, but that has in fact very little bearing on the strength of the password hash construction.
Clearly there is some critical piece of knowledge that I'm lacking, so please help me understand where my misunderstanding lies.

The article announcing the breach contains the term "SHA1" in exactly two places: "passwords (hashed using SHA1 with a salt;" and "password hashing algorithm from SHA1 to bcrypt".

Absent evidence to the contrary (and the article provides none), I am reading "hashed using SHA1 with a salt" to mean they used this construction:

    Hp = H(S||P) or
    Hp = H(P||S)
    where:
    S is a salt (derivation method unstated)
    P is the plaintext password
    || is byte concatenation
    H( ) is a hash function (sha1 in this specific case)
         applied only once to the input bytes
    Hp is the "hashed salted password"
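
For concreteness, the construction above might be sketched in Python as follows. This is only an illustration of the reading given here: the H(S||P) ordering and the random 16-byte salt are assumptions (the article states neither), and `salted_sha1` is a hypothetical helper name.

```python
import hashlib
import os


def salted_sha1(password: bytes, salt: bytes) -> str:
    """Hp = H(S || P): a single SHA-1 pass over the salt followed by the password."""
    return hashlib.sha1(salt + password).hexdigest()


salt = os.urandom(16)  # assumption: the article does not say how salts were derived
hp = salted_sha1(b"correct horse", salt)
print(hp)  # 40 hex characters; differs per run because the salt is random
```
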
How does the strength of the chosen hash not have a direct bearing on the strength of the construction H(S||P) (or H(P||S))? The construction is nothing but the chosen hash. What am I misunderstanding here?
Forget about the strength of the underlying hash. That's not how you recover passwords from hashed password databases. In reality, the way you recover passwords is to take a dictionary starting at AARDVARK and work your way to ZEBRA and every alphanumeric string in between, hashing each one and comparing it to the target hash. Because MD5, SHA1, SHA2, Blake, Blake2, and SHA3 are all designed to be as fast as possible, this attack is extremely effective, and can be accelerated dramatically with GPUs.
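
A rough sketch of that guessing loop in Python, assuming the attacker holds the salt and the hash from the leaked database. The three-word list, the all-zero salt, and the function names are hypothetical stand-ins; a real attack would use a large dictionary and GPU tooling rather than a Python loop.

```python
import hashlib


def salted_sha1(password: bytes, salt: bytes) -> bytes:
    """One SHA-1 pass over salt || password (assumed construction)."""
    return hashlib.sha1(salt + password).digest()


def dictionary_attack(target: bytes, salt: bytes, candidates):
    """Hash each candidate with the known salt; return the first match, else None."""
    for word in candidates:
        if salted_sha1(word, salt) == target:
            return word
    return None


salt = b"\x00" * 8                     # hypothetical salt taken from the leak
target = salted_sha1(b"zebra", salt)   # hypothetical leaked hash
wordlist = [b"aardvark", b"monkey", b"zebra"]
recovered = dictionary_attack(target, salt, wordlist)
print(recovered)
```

Nothing in the loop depends on collision weaknesses in the hash; the only thing that matters is how cheaply each candidate can be hashed.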

The "password hashes" PBKDF2, bcrypt, scrypt, and Argon2 are all designed, the same way a KDF is designed, to mitigate this attack. All of them have a "work factor" that requires you to iterate the underlying hashing primitive (which might very well be SHA2) many times before arriving at the answer.
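
To illustrate the work factor, here is a small sketch using PBKDF2 from Python's standard library (`hashlib.pbkdf2_hmac`). The password, salt, and iteration counts are arbitrary example values, not recommendations, and `pbkdf2_hash` is a hypothetical wrapper name.

```python
import hashlib
import time


def pbkdf2_hash(password: bytes, salt: bytes, iterations: int) -> bytes:
    """PBKDF2 with HMAC-SHA256; `iterations` is the work factor."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


password, salt = b"hunter2", b"0123456789abcdef"  # example values only
for iterations in (1, 1_000, 100_000):
    start = time.perf_counter()
    pbkdf2_hash(password, salt, iterations)
    elapsed = time.perf_counter() - start
    # The cost of one guess grows roughly in proportion to the iteration count.
    print(f"{iterations:>7} iterations: {elapsed * 1000:.2f} ms per guess")
```

The defender pays this cost once per login; the attacker pays it once per candidate password.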

SHA1 and SHA2 aren't password hashes. That's what people here keep trying to explain. None of the well-understood flaws in MD5 and SHA1 are really relevant to the password hash setting. They're a disaster for cryptographic signature constructions, but they do not matter at all for passwords.

Sha1 hasn't been the recommended best practice for a very long time. (Really ever.) Bcrypt dates back to 1999. Even if you give it 10 years for evaluation it would have to be considered in 2009. And indeed it was recommended in 2007, 5 years before this breach. RFC2898 (PBKDF2) came out in 2000, 12 years before this breach. Scrypt was released in 2009, so I could understand not adopting it by 2012 out of concern for insufficient vetting. Sha1 would only have been acceptable between 1995 (its release) and 2000 or so. Though even then the practice of key stretching was known: IIRC /etc/shadow has done that since the beginning, running 1000 iterations of MD5 by default. Looking it up, that was released in 1987. 25 years!
http://valerieaurora.org/hash.html

that's BS to think sha1 was the best hash you could pick in 2012

This is a chart of general-purpose hashes, not password hash constructions. All the hashes on Valerie's chart are bad password hashes.
I said nothing at all about sha1 being "the best ... you could pick". You read that in from somewhere.

I said it (sha1) was significantly better than MD5 or plaintext. That neither says nor implies that sha1 is best, just that it was better than other options that some might have chosen in 2012.

And that is false, sorry to say. Plainly false. The weaknesses unique to MD5 (in 2012) and SHA1 (in 2016) don't matter for password hash constructions. The weaknesses shared by salted MD5, SHA1, SHA2, and SHA3 --- each a distinct construction from the underlying hash --- matter hugely for password storage.

The problem is that MD5, SHA1, SHA2, and SHA3 are not password hashes. The password hash constructions in common use are PBKDF2, bcrypt, scrypt, and Argon2. Some of them use SHA2 as a primitive, some of them don't, but none of them work by simply concatenating a salt with a password and hashing.

It doesn't matter if it's a "password hash" if it's a cryptographically secure hash and a long enough password. If it can withstand all the attacks that give you shortcuts to finding out what the input was, given the output, it's fine.

Password hashes only help protect against brute force searches by increasing the cost to attack linearly with the cost to verify. But that isn't a great tradeoff and isn't future-proof.

All the crypto engineering that goes into password hashes is about the fact that passwords aren't long enough, so your "if" caveat makes your argument rather disconnected from the real world. People won't use passwords with a sufficient amount of entropy, and they couldn't even if they wanted to (because of difficulty memorizing them, typos, lack of good text entry UI on mobile devices, etc).
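
A back-of-the-envelope sketch of why length matters, in Python. The guess rates below are purely illustrative assumptions, not measurements of any real hardware, and the formula assumes uniformly random passwords, which human-chosen ones are not (so real passwords fare worse).

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time to try every possible password at a given guess rate."""
    return alphabet_size ** length / guesses_per_second


FAST_RATE = 1e10  # hypothetical rate against a fast unsalted-or-salted hash
SLOW_RATE = 1e5   # hypothetical rate against a tuned password hash

for length in (8, 12, 20):
    bits = entropy_bits(62, length)  # 62 = a-z, A-Z, 0-9
    fast_days = seconds_to_exhaust(62, length, FAST_RATE) / 86400
    slow_days = seconds_to_exhaust(62, length, SLOW_RATE) / 86400
    print(f"len {length}: {bits:.1f} bits, {fast_days:.3g} days fast, {slow_days:.3g} days slow")
```
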

As long as you're using a password entry field designed for manual entry, you can't credibly counter that with "people should use password managers and autogenerated long line-noise passwords". Because you can't base your security upon all your users taking the initiative and doing the power-user non-default thing.

It's also true that even with a "password hash" your short password is not secure. It makes the attack more expensive, maybe from $10 to $10000 today, or $1000 next year. But practically that isn't something you should rely on.