by dkislyuk
433 days ago
This is a great characterization of self-information. I would add that the `log` term doesn't just conveniently happen to satisfy the additivity axiom: turning products into sums is the historical reason the logarithm was invented in the first place. That is, the log function was specifically constructed as a family of functions satisfying f(xy) = f(x) + f(y). So self-information is pinned down by assuming (1) that information is a function of probability alone, (2) that no information is transmitted by an event that is certain to happen (i.e. f(1) = 0), and (3) that the information from independent events is additive. Given mild regularity (e.g. continuity) and non-negativity, h(x) = -log p(x), up to the choice of base, is the only nontrivial family of functions that satisfies all of these properties.
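
For anyone who wants to poke at this numerically, here is a minimal Python sketch of properties (2) and (3). The function name `self_information` and the default base of 2 (bits) are my own choices for illustration, not something from the parent comment.

```python
import math


def self_information(p: float, base: float = 2.0) -> float:
    """Return -log_base(p), the self-information of an event with probability p.

    Hypothetical helper for illustration; base 2 gives the answer in bits.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    # log(1/p) equals -log(p); written this way so that p = 1 yields 0.0
    # rather than -0.0.
    return math.log(1.0 / p) / math.log(base)


# Property (2): a certain event carries no information.
certain = self_information(1.0)

# Property (3): for independent events, p(x, y) = p(x) * p(y),
# so the information of the joint event is the sum of the parts.
p_x, p_y = 0.5, 0.25
joint = self_information(p_x * p_y)
summed = self_information(p_x) + self_information(p_y)

print(certain, joint, summed)
```

Changing `base` only rescales every value by the same constant factor, which is the sense in which the solution is a family of functions rather than a single one.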