What differentiates a non-crypto hash from a crypto hash? Is there a fundamental difference between the two that prevents one from being used for cryptographic purposes?
A non-crypto hash runs enough rounds of mixing to confuse "simple" statistical tests.
A crypto-hash runs enough rounds of mixing to confuse "difficult" statistical tests from a dedicated adversary.
For example, the JOAAT hash (Jenkins one-at-a-time) is a simple add / shift / XOR hash that runs one round per input byte. A single round per byte is all you need to confuse the "simple" statistical tests such as birthday-spacing tests, dice rolls, etc.
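To make the "one round per byte" point concrete, here is a minimal Python sketch of JOAAT as it is commonly published. The function name `joaat` and the explicit 32-bit masking are choices made for this sketch (Python integers don't wrap on their own), not part of any library.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound

def joaat(data: bytes) -> int:
    """Jenkins one-at-a-time hash: one add/shift/XOR round per input byte."""
    h = 0
    for byte in data:
        # One "round": add the byte, then mix with a shift-add and a shift-XOR.
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    # Final avalanche, run once after all bytes are consumed.
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h

if __name__ == "__main__":
    print(hex(joaat(b"a")))  # 0xca2e9442
```

The whole per-byte round is three cheap operations, which is why it is fast and also why nobody should expect it to hold up against an adversary.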
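SHA-256 is also built from simple add / rotate / shift / XOR operations (plus a few bitwise ANDs), albeit on a 256-bit state fed 512-bit message blocks instead of a 32-bit state fed 8 bits at a time. It is a 64-round hash function. That is: each block goes through round(round(round(round(data)))), 64 deep.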
--------
If you wanted to turn the non-crypto JOAAT hash into a crypto-hash, a good first step would be to run JOAAT(JOAAT(JOAAT(...))) maybe 64x or 128x over the data.
Obviously, you'd need to run cryptanalysis over the end result, test for non-linearity and other such properties, and widen the 32-bit state (a 32-bit output can be brute-forced no matter how many rounds you run). But "large number of rounds" is probably the most obvious difference between non-crypto and crypto hashes.
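One of those "other such properties" is avalanche behavior: flipping a single input bit should flip about half of the output bits. Below is a rough sketch of such a test against SHA-256 using Python's standard `hashlib`. The helper names `bit_diff` and `average_flipped_bits` are made up for this example, and a real evaluation would use far more inputs and far harsher tests.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def average_flipped_bits(msg: bytes) -> float:
    """Flip each input bit in turn; return the mean number of digest bits that change."""
    base = hashlib.sha256(msg).digest()
    total = 0
    n_bits = len(msg) * 8
    for i in range(n_bits):
        flipped = bytearray(msg)
        flipped[i // 8] ^= 1 << (i % 8)  # flip exactly one input bit
        total += bit_diff(base, hashlib.sha256(bytes(flipped)).digest())
    return total / n_bits

if __name__ == "__main__":
    # For a 256-bit digest the ideal average is about 128 changed bits.
    print(average_flipped_bits(b"hello world"))
```

Passing this kind of check is necessary but nowhere near sufficient. It is exactly the sort of "simple" test that a non-crypto hash may also pass.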
---------
In the late 90s, ciphers like AES were designed with "maximum confusion / diffusion per round". In just 10 rounds (for AES-128), the bits are mixed up enough that it defeats linear and differential cryptanalysis.
I don't know which cipher "changed the meta", but maybe it was the XTEA cipher? Since XTEA was so simple, they knew it would need many, many rounds to defeat linear and differential cryptanalysis. So they just did 64 rounds and called it a day.
----------
So you can see, a "crypto" hash is effectively a regular old hash done an absurd number of times to defeat the most incredible statistical-tests known to man... done under the assumption that the "opponent" is performing the most difficult statistical tests.
Yes! We’d like our cryptographic hash functions to be collision resistant and preimage resistant.
That is, we’d like it to be hard to generate two different messages m1 and m2 where the hash of m1 is equal to the hash of m2 (collision resistance), and we’d also like it to be hard to recover m, or any other message with the same hash, if you’re given only the hash of m (preimage resistance).
Non-cryptographic hash functions don’t require these properties, and in fact some hashing algorithms used for data mining (locality-sensitive hashes, for example) are designed to map near inputs to near outputs.
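To see what "not collision resistant" looks like in practice, here is a small sketch that brute-forces a collision in the 32-bit JOAAT hash with a plain birthday search. The `joaat` and `find_collision` names, the decimal-counter messages, and the `limit` parameter are all assumptions of this sketch. With only 2^32 possible outputs, a collision is expected after roughly 2^16 to 2^17 attempts, which takes well under a minute on a laptop.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound

def joaat(data: bytes) -> int:
    """Jenkins one-at-a-time hash (32-bit output)."""
    h = 0
    for byte in data:
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h

def find_collision(limit: int = 1_000_000):
    """Hash the messages b"0", b"1", b"2", ... until two share a hash value.

    Returns a pair of distinct messages with equal hashes, or None if no
    collision turns up within `limit` attempts.
    """
    seen = {}  # hash value -> first message that produced it
    for i in range(limit):
        msg = str(i).encode()
        h = joaat(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    return None

if __name__ == "__main__":
    print(find_collision())
```

The same search against a 256-bit digest would need on the order of 2^128 attempts, so output width matters here just as much as how well the rounds mix.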