by datenwolf 1065 days ago
Actually hashes do create entropy (every computation creates entropy in some form or another). What's the entropy of a 4 bit number? What's the entropy of a 4 bit number hashed by a 64 bit hash function? The act of computation does in fact create entropy, as per the 2nd law of thermodynamics, a part of which shows up in the hash.
2 comments

I don't think you understand what this conversation is about. We are considering information theoretic entropy, not thermodynamic entropy from the mechanism of computation itself.

The result of applying a deterministic function to a random variable cannot have more entropy than the underlying random variable. This is a theorem, one that is trivial enough not to have a name. But you can find solution sets to homework that will prove it for you: https://my.ece.utah.edu/~rchen/courses/homework1_sol_rr.pdf
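
A minimal sketch of that bound, for anyone who wants to check it numerically. It assumes the 4-bit input is uniformly distributed, and `hash64` is a hypothetical stand-in for "a 64 bit hash function" (here the first 8 bytes of SHA-256); neither choice comes from the thread.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(samples):
    """Shannon entropy, in bits, of the empirical distribution of samples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def hash64(n):
    """Hypothetical 64-bit hash: first 8 bytes of SHA-256 of a one-byte input."""
    return hashlib.sha256(bytes([n])).digest()[:8]


# Assumption: every 4-bit value is equally likely.
inputs = list(range(16))
outputs = [hash64(n) for n in inputs]

h_in = shannon_entropy(inputs)    # log2(16) = 4 bits
h_out = shannon_entropy(outputs)  # bounded above by h_in

print(f"H(X)    = {h_in:.3f} bits")
print(f"H(f(X)) = {h_out:.3f} bits")
```

The output is 64 bits wide, but it can take at most 16 distinct values, so its Shannon entropy cannot exceed that of the input.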

> every computation creates entropy in some form or another

Ok, what is the entropy created by this function that maps a 4-bit number to a 64 bit number:

    0 -> 0
    1 -> 1
    2 -> 1
    3 -> 1
    4 -> 1
    ...
    15 -> 1

60 bits. Yes, I know, you can compress it down very well. But consider that entropy in computation involves not just the bits you store, but also the bits that the processor touches and eventually dissipates as heat into the universe.

What definition of entropy do you use?

(I'm using Shannon entropy.)
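
For reference, a small sketch of what the Shannon figure for the mapping above works out to. It assumes the 4-bit input is uniform, which the thread does not state, and the function name `f` is just a label for that mapping.

```python
import math
from collections import Counter


def f(n):
    """The mapping from the thread: 0 -> 0, everything else -> 1."""
    return 0 if n == 0 else 1


# Assumption: all 16 inputs are equally likely.
outputs = [f(n) for n in range(16)]
counts = Counter(outputs)

# Shannon entropy of the output distribution, in bits.
entropy = -sum((c / 16) * math.log2(c / 16) for c in counts.values())

print(f"{entropy:.3f} bits")
```

Under that assumption the output is 0 with probability 1/16 and 1 with probability 15/16, which gives roughly a third of a bit, regardless of how many bits are used to store the result.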

Boltzmann. But it doesn't really matter, it's the same thing. Yes, I know that a sequence of, say, 1000 identical bits looks like it has just 10 bits of entropy after simple RLE compression. But you must not forget the entropy that is also generated in the computation itself and subsequently dissipated into the universe.

It's not the same thing. If I define a function that always returns 1, then the Shannon entropy is extremely low regardless of whether the Boltzmann entropy of running it on a CPU is high. That the two measures can differ shows they cannot be the same thing. Related in concept, different in definition. In fact, you can even use the same formulas for calculating them; what differs is what you're calculating them on.

> If I define a function that always returns 1…

then its Kolmogorov complexity is also extremely low.

Look, if you have a good enough hash function, its output should be near the Shannon limit and hardly compressible, and ideally contain as much entropy as it has bits. You can feed in just a single bit or the entire knowledge of humanity; in the end you're going to get a fixed number of bits, with entropy close to that, and if you throw any form of lossless compression at it, it will hardly compress.
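
A rough sketch of the compressibility point, using SHA-256 and zlib as stand-ins (both are my choice, not something named above), with a run of 1000 identical bytes for contrast:

```python
import hashlib
import zlib

# Hash a tiny input: a single byte.
digest = hashlib.sha256(b"\x01").digest()  # 32 bytes of output

# Try to compress the digest, and a highly regular buffer for comparison.
compressed_digest = zlib.compress(digest, 9)
compressed_run = zlib.compress(bytes(1000), 9)  # 1000 zero bytes

print(len(digest), "->", len(compressed_digest))
print(1000, "->", len(compressed_run))
```

The digest does not shrink (zlib's framing actually makes it a little longer), while the run of zeros collapses to a few bytes. This only illustrates the compressibility claim for a single output string.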

But quantum mechanics tells us that information cannot be destroyed. So when you feed it more bits than it emits, it's mostly the entropy of the information you feed in that you get out of the hash. But if you feed it just a single bit, the additional entropy comes from the computational process.

I know this is now getting really philosophical, but here's something to ponder: How would you implement a hash function for a reversible computing architecture?