Hacker News new | ask | show | jobs
by tptacek 3971 days ago
SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

My current favorite conservative hash choice is SHA512/256, which is the SHA-2 that generates a 512-bit output but truncates it to 256. It gives you the same length extension protection that is the most important feature of SHA-3, and is available in most libraries already.

I have never recommended to anyone that they switch from SHA-2 to SHA-3. I'm actually in "wait and see" mode about SHA-3; there are compelling other hashes available if you want to be ultra-modern about which hash you use.

8 comments

> SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

To extend on that, shortly after SHA-1 fell, there was the very real threat that the SHA-2 family would follow suit (they are conceptionally similar). This worry brought NIST to hold the SHA-3 competition. Fortunately, the SHA-1 attacks did not turn out to be transferrable, so far, and consequently trust in SHA-2 has substantially increased since. Still, NIST (rightly) followed through with the initial idea of the contest and chose a hash function that was as different from SHA-2 as possible (Keccak).

Thus, we have now two very high quality hash functions to our disposal. If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes). This construction is collision resistant if at least one of them remains collision resistant. (Pseudo randomness relies on the security of both hashes, though, and hashing the whole message twice comes at a hefty performance hit. Especially since SHA3-512 is veeery slow – blame it on the clueless tech media attacking NIST for tweaking Keccak, ignoring even the authors who supported NIST's decision.)

> This construction is collision resistant if at least one of them remains collision resistant

Please don't throw around well-defined terms. This isn't true.

What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

On the other hand, collision resistance is a comparison between 2^(hash_length/2) and the work factor required to find a collision. Concatenating the two outputs would only remain collision resistant if it caused an exponential increase in the work factor to find a collision.

Since the SHA-512 output is the whole hash state, once you've found a SHA-512 collision, you can keep appending to the two collided documents and they'll stay collided, so you can use this as a starting point for your SHA3-512 collision. So, even assuming no weaknesses, the work factor to find collisions in your 1024-bit concatenated construction is 2^256 + 2^256, not 2^512, and thus not collision resistant.

Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant. However, as proposed, it's not collision resistant, even if both underlying hash functions are collision resistant.

> Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant.

Actually, as long as the hash functions are iterative, the whole construction can never be significantly stronger than the best hash function, see [1].

> What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

What I meant was "as long as it is infeasible in practice to find a collision in either of them, it is infeasible to find a collision in the concatenation". Comparing the security of hash functions to random oracles with the same output length only makes sense if the construction of the hash function supposedly affords this security.

Conversely, I find it absurd to call the hash function that outputs the first 64 bits of SHA-1 collision resistant, because it requires at least 2^32 steps to find a collision. It fits with the oracle definition, but gives you no information about its real world security.

If you want to make precise statements, you can add the work factor to your statement, e.g. "The first 512 output bits of SHAKE-256 afford preimage resistance up to a work factor of up to 2^256".

[1] Antoine Joux. Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions. In Advances in Cryptology - CRYPTO 2004, volume 3152 of Lecture Notes in Computer Science, pages 306–316. Springer Berlin Heidelberg, 2004. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128...

> If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes).

Although keep in mind that you'll leak information about the input if either hash leaks information about the input.

For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant... but you wouldn't want to use `badhash(pad(secret) ++ nonce)` as a precommitment scheme. All of the extra entropy in the nonce, which otherwise might have protected against brute force attacks on low-entropy secrets, is being given to the attacker via the crc.

> For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant...

Actually, it isn't, for the usual definition of collision resistance compares the work factor to find a collision against 2^(hash_length/2). Extending a hash with crc32 lengthens the hash, but increases the bar for considering the hash collision-resistant. Concatenating the outputs of two collision-resistant hash functions doesn't even (generally) result in a collision-resistant construction under the normal definition of collision resistance.

EDIT: See my nearby post in this same thread for a longer explanation.

Concatenation of the hashes seems like an unjustified risk, that in certain circumstances will allow weaknesses from either algorithm to flow throw to the final hash. If you really want to combine the hashes, XOR seems like a safer bet to me (since the algorithms are unrelated there should be no potential cancellation of entropy).
Seems like XOR is better for approximating a random oracle, and appending is (negligibly) better for ensuring collision resistance. People often are not clear about which of these two very different properties they actually want out of a hash.
++ on SHA-512/256. SHA-512 uses 64-bit operations where SHA-256 uses 32-bit, so on a beefy 64-bit chip, it's faster per byte hashed. So, compared to SHA-256, more rounds, twice the state size, same familiar/widely-implemented design, and faster -- what's not to love?

See http://bench.cr.yp.to/results-hash.html for a comparison of hash-function speeds.

Thanks for the link. I was just about to ask the very ideation you answered. :)
Dang, just saw your reply which was better than mine.
Unfortunately, the SHA cpu extensions that will soon be available in Skylake Xeon parts (and the crypto extensions in ARMv8-a) only support sha2-256 (and SHA256/224, and the ill-advised sha1... why is Intel adding instructions or microcode for a hash function that's being phased out?). So you have a choice between a faster-in-software-on-64bit sha-512/256 and a faster-in-hardware sha-256. Unless there's some way to get partial speed-up of sha-512 using sha-256 instructions, but at a glance they don't look low-level enough to apply to sha-512... do they?

Or you can ignore both sha256 and sha512/256 and use something else like sha-3 or blake2b. Blake2b obviously has less attention on it so more likely to harbor a weakness, but it's fast in software. And sha-3 will get cpu extensions eventually, and it'll hopefully be a better thought out inplementation than just support for the 256-bit variant.

In that case HMAC-SHA-256 may be a good choice. It too is immune to length extension attacks, and the HMAC construct has proven itself to greatly augment the strength of the underlying hashing algorithm (e.g. MD5 is considered broken, but HMAC-MD5 is not). It's just twice as expensive as SHA-256, so I'm not sure if that's faster than SHA-512 on software versus HMAC-SHA-256 on hardware.
512/224 is fine too, and also isn't length extendable.
Skein was built (by Bruce Schneier's team) from the ground up to work to take advantage of multi-core systems to be faster via concurrency.

I'm still quite a fan of it and was sad that it lost. Both are excellent algorithms however.

whaaaaaaat - taking a 512-bit hash and truncating it to 256 preserves everything you need about it! That's crazy (surprising). Naively - if you hadn't just told me otherwise - I'd think it's up there with my brilliant new algorithm: in a loop pad your input with 0x00 through 0xFF, take the SHA512 hash of each result, but only use the first bit! You now have a sooper secure 256 bit hash. I call it SHA512/1/1/.../1 (I'd write it all out here, but it looks obnoxious and might break someone's window width.)
Also, on 64 bit machines SHA-512/256 is generally faster than SHA-256.
I'd love to know what these other compelling hashes are.
Somebody else in this thread was talking about BLAKE2, which I cast a cursory glance at. It seems pretty cool, claims to evade the length-extension 'issues' that SHA-1 has.

Wikipedia indicates that there has been at least some progress as far as cryptanalysis goes, but even with that being said, there's always that lingering 'but what if' about anything NSA-related.

SHA-2 is also length-extendable, which means you have to be careful when you use it to build a MAC. (That's why I like the truncated version).

No cryptographer I know takes these particular "what-if's" seriously. They appear to come exclusively from non-cryptographers reacting to anything that NIST touched.