|
You may not get an exact answer for SHA-1 or SHA-2, because they were designed by NSA, and I'm not sure whether a complete design doc has been published. SHA-3, by contrast, was designed in an open competition, and is completely different from SHA-1 and SHA-2. But here's a rough go at it. First of all, the overall form. SHA-2 is a Merkle-Damgård construction, meaning that it has a "compression function" which takes an algorithm state and a message block, and outputs a new algorithm state. The final block gets some padding, so that eg "x" and "x\000" don't have the same hash, and then the final algorithm state is the output. (This is a known weakness: from H(x) you can calculate H(x concat padding(x) concat other stuff). SHA-3 doesn't have this weakness. This weakness doesn't itself break collision resistance, but it can be dangerous for other uses of SHA2 if you aren't careful.) The compression function is a Davies-Meyer construction, which means that given an input state, it: * Copies the state. * Encrypts the state with a "cipher" of sorts (the "cipher" extracted from SHA-2 is known as SHACAL-2), using the message as a "key". * Adds the copied state to the encrypted state to produce a new state. Davies and Meyer proved that, for an ideal cipher, this would make a collision-resistant hash function. Note that because the update is a "cipher", it must be internally invertible. This is a good idea anyway, because any non-invertibility is by definition a collision, and you want to push that out as far as possible to make collisions hard to find. The step that isn't invertible is the final "add the copied state to the encrypted state". Within that design, the SHA-2 cipher is based on a shift register. Basic shift registers have been used in crypto for many decades; traditionally you get almost all of the bits in the register by shifting its current state, and at least one (typically only the one shifting in) by computing a function of the other bits. To extend this to general-purpose computers, you can do the same thing for words (32-bit words for SHA-224 and -256 and 64-bit words for SHA-384 -512). So you have A shifting to B shifting to C etc, with a final value that's computed and shifted back into A. You can see there's a slight twist, which is that instead of E=D you get E=D+stuff. I expect that this mitigates issues where the same stuff is used in the same way throughout the whole round. The "message schedule" is likewise based on a cipher's key schedule. The idea in a cipher is that each round needs a different version of the key to avoid "slide" attacks, so you mix the key during the round as well as mixing the cipher state. I'm not sure what the corresponding theory is for a hash function, but it's also pretty clear that you want a single byte of the message to be used in many places throughout the operation, so mixing the message is a must. For the operations, the biggest constraint is that they need to be as nonlinear as cheaply, because doing a bunch of linear functions in a row still gets you a linear function, and those are easy to break. They need to be nonlinear both as bits and as integers, so a popular design is to mix addition (linear over ints) with xor (linear over bits) and rotations (linear over both, but you need it to propagate changes from the top of a number to the bottom). That way the whole operation won't be linear over bits, nor over the integers, but all three operations are cheap on PCs. This is called an "ARX" (Add-Rotate-Xor) design. Picking the exact operations and rotation constants is something of a black art, and while you can trace descent from MD5-SHA0-SHA1-SHA2, the exact design criteria for the SHAs might not even be public. But you can see how some of the decisions may have been made. Consider the rotations in sigma0 and sigma1. The rotation constants in these designs are typically chosen so that if you look at any slice of the output, it will have many different input bits affecting it, which in turn means that small changes diffuse rapidly thoughout the design. This is not done perfectly in SHA2, in that the same bits are used for the Maj and Ch in successive iterations (i.e. you compute Maj(A,B,C) and then Maj(newA,A,B) next time since they shifted over), so you'd get correlated outputs of those steps from round to round, but apparently it's good enough. The setup constants are chosen as the square and cube roots of prime numbers, as "nothing-up-my-sleeve" numbers. The idea is that if the initialization were random numbers with no rationale given, then it would be a possible place that NSA could have hidden a backdoor. But it would be difficult to create a backdoor where the initialization is the square roots of the primes. |
For example: if "add Foo" were our primitive, then 64-rounds of "add-Foo" could be broken by simply "data + 64 * Foo". IE: We found a "trivial shortcut" and someone else can break our cipher way faster than we can use it.
Addition is linear over numbers, because of multiplication.
XOR is linear over numbers: because XOR undoes itself. (5 x (XOR-Foo) is equivalent to 1x (XOR-Foo)).
It turns out that combining addition and XOR together results in a non-linear function over numbers and non-linear function over bits.
We can't "shortcut numbers", because the XOR-messes up our "number math". We can't "shortcut bits" because Add has this really annoying "carry" that basically has a 50% chance of shuffling bits around.
As such, we "expect" that combinations of Addition, XOR, and Shifts are non-linear. I'm not sure if it's been mathematically proven however, but no one has been able to show a "shortcut" thus far.