Hacker News new | ask | show | jobs
by marcosdumay 1063 days ago
> it also creates entropy

It's a nitpick, but it concentrates the entropy. It doesn't create any.

I do think it will make the answer more clear, as the hash concentrates the less than 64 bits of entropy on that 128 bits of original data into a usable 64 bits package.

1 comments

Actually hashes do create entropy (every computation creates entropy in some form or another). What's the entropy of a 4 bit number? What's the entropy of a 4 bit number hashed by a 64 bit hash function? The act of computation does in fact create entropy, as per the 2nd law of thermodynamics, a part of which shows up in the hash.
I don't think you understand what this conversation is about. We are considering information theoretic entropy, not thermodynamic entropy from the mechanism of computation itself.

The result of applying a deterministic function on a random variable cannot have more entropy than the underlying random variable. This is a theorem, one that is trivial enough to not have a name. But you can find solution sets to homework that will prove it for you: https://my.ece.utah.edu/~rchen/courses/homework1_sol_rr.pdf

> every computation creates entropy in some form or another

Ok, what is the entropy created by this function that maps a 4-bit number to a 64 bit number:

    0 -> 0
    1 -> 1
    2 -> 1
    3 -> 1
    4 -> 1
    ...
    15 -> 1
60 bits. Yes, I know, you can compress it down very well. But consider that entropy in computation involves not just the bits you store, but also the bits that the processor touches and eventually dissipates as heat into the universe.
What definition of entropy do you use?

(I'm using Shannon entropy.)

Boltzmann. But it doesn't really matter, it's the same thing. Yes, I know that looking at a sequence of, say 1000 identical bits looks like it's got just 10 bits of entropy after simple RLE compression. But you must not forget the entropy that also generated in the computation itself, and subsequently dissipated into the universe.
It's not the same thing. If I define a function that always returns 1 then the Shannon entropy is extremely low regardless if the Boltzmann entropy of running it on a CPU is high. That the two measures can be different shows they cannot be the same thing. Related in concept, different in definition. In fact, you can even use the same formulas for calculating it - what differs is what your calculating it on.