| The primitive (data) XOR (rotate-data) XOR (rotate-data) is invertible, which means there's no "bit funnel" (Bob Jenkin's term). You want your "primitives" to be invertible, so that your one source of "non-invertible" operations is controlled very carefully. Hash functions are non-invertible. But all operations on the "internal state" should be invertible (!!!!) to maximize the entropy per round. -------- All good hash-functions have invertable operations (aka: a bijective (1-to-1 and onto) operation) over its state to "distribute" the bits around in some manner. You then perform a non-reversable XOR over the input message data (this is the one and only non-reversible step), maybe after operating over the input data somehow. Whenever you're "mixing" bits around in a cipher or hash, you need every step to be invertible to maximize your entropy. If you have 2^256 possible input states, you want to have 2^256 output states. By the pigeon-hole principle, this only happens if you have a bijective / 1-to-1 and onto invertible operation. -------- Lets look at the opposite. Lets say we have a non-invertible function. That is, 2^256 possible inputs become only 2^128 outputs. Guess what? We've just lose 128-bits of information (!!!). It doesn't matter how complex your operation is. If you have 256-bits of input and only 128-bits of output, you've "lost information". You _NEVER_ want to lose internal-state information. The hash-function's internal state should be at a maximum 256-bits (with 256-bits of entropy) every step. Sure, the input might be 512-bits, 4096-bits, or 100-MBs long, but each "step" of a 256-bit hash should retain 256-bits of state to be maximally "difficult" to reverse. That's kinda-sorta obvious if you think of it from a pigeon-hole perspective. |
If your round function isn't invertible, then it's going to converge at some point on some fixed value, and the round function is going to stop doing anything useful.
More broadly, SHA2 is a sort of instance of a construction called Davies-Meyer, which treats each message block as the key to a bona-fide block cipher (SHACAL, in SHA's case), each encrypting the previous block. It's hopefully pretty obvious why a block cipher core needs to be invertible. :)
So I also find it kind of helpful to remember that you can take any block cipher and turn it into a hash, and then a "good" hash function is just optimizing the block cipher core around the goals of a hash function (be fast, be nonlinear, be hard to find differential trails through the whole sequence of round functions, &c).
It's a thing I don't love about illustrations like the one we're discussing, in that it sort of presents this "intelligent design" problem of, like, "how did we arrive at this incredibly complex fully functioning eyeball", when there's a whole series of smaller evolutionary steps that led to this point; it's more productive maybe to look at the bones of the whole design before diving into the details of which bits go where at each precise step.