| 1. You want an invertible operation. Invertible operations do _NOT_ lose information, and therefore have the maximum amount of entropy per step. 2. You want the invertible operation to pass a statistical test called "differential cryptography analysis". Over multiple rounds, it must be difficult / impossible to "predict" how 1-bit of difference changes the state. (ie: 1-bit of difference should lead to 50.0000000% change in every output bit. If its 50.1% or 49.9% change of bits, you fail because someone running cryptoanalysis will pick that up). ---------- #1 is somewhat easy. It turns out that Data XOR (Rotate-left Data) XOR (Rotate-right Data) is all you need to make an invertible operation. 5-operations, no more (any more is redundant and therefore unnecessary use of compute power), no less (any less is not invertible and therefore loses information / entropy each step). #2 is complicated: you gotta understand differential cryptoanalysis and run a bunch of tests. ----------- The discovery that (Data) XOR (Rotate-left Data) XOR (Rotate-right Data) was invertible became extremely popular in the 2010s through 2020s, and has become the cornerstone of XOR / Rotate / Add ciphers (aka: chacha), pseudorandom generators, and hash functions. I don't know quite when it was originally discovered, but Jenkins was blogging about the importance of invertibility and playing with invertible xor/rotate stuff in (non-crypto) hash functions way back in the 90s. I know Knuth's "Art of Computer Programming" book 2, Seminumerical Algorithms, discusses the importance of invertibility in random number generators, which is closely related to hashing / cryptographic procedures. So this "understanding" has been around for decades, but has only "become popular" in the SHA256 / Blake3 / pcg-random era. ------ In the 90s, ciphers were mostly using SBoxes for this step ("confusion", to grossly change a value to another value without losing information). But today, modern CPUs are much faster at add/xor/bitshift operations than reading/writing to memory. So SBoxes are no longer a high-speed methodology / primitive for these kinds of operations. It makes more sense to change our algorithms to use a new "invertible confusion operation" (aka: what SBoxes did before, and what ((Data) XOR (Rotate-left Data) XOR (Rotate-right Data)) does today). -------- EDIT: Remember: the modern crypto-primitive is just a combination of "confusion" principles and "diffusion" principles. 1. Confusion "lossless transforms" numbers into other numbers. (A "set permutation" of sorts) 2. Diffusion "moves" bits from one number into other numbers. Iterating over confusion + diffusion many times (IIRC, SHA256 is 64 rounds) is all you need to make a cryptographic cipher. If you "just" need a pseudo-random number generator or hash function, maybe 5 to 10 rounds is all you need. |