by IngoBlechschmid 2801 days ago
"E[foo]" is syntax to mean the expected value of the random variable foo, roughly speaking the mean value. (For instance the expected value of a fair six-sided die roll is 3.5. The terminology is slightly suboptimal, since we will never expect a die to come up 3.5.)

Hence the "E" itself is called an "operator". It can be applied to a random variable in order to yield its expected value. You can read up on it here: https://en.wikipedia.org/wiki/Expected_value
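
To make the die example concrete, here is a minimal sketch of the "E" operator for a finite discrete distribution. The function name expected_value and the variable names are my own choices for illustration, and the die is assumed to be fair and six-sided.

```python
from fractions import Fraction

def expected_value(outcomes, probabilities):
    """Sum of outcome * probability for a finite discrete distribution."""
    return sum(x * p for x, p in zip(outcomes, probabilities))

# Assumption: a fair six-sided die, so each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

die_mean = expected_value(faces, probs)
print(float(die_mean))  # 3.5
```

Note that 3.5 is not itself a possible outcome, which is exactly the terminology quirk mentioned above.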

The definition "E[x] = mu" is correct, though I would write it the other way, as "mu = E[x]", as it's the variable mu which is being defined.

The v's disappear because of a suppressed calculation:

sigma^2 = E[ (v^T x - E[v^T x])^2 ]
        = E[ (v^T x - E[v^T x]) (v^T x - E[v^T x]) ]
        = E[ v^T x v^T x - 2 v^T x E[v^T x] + E[v^T x] E[v^T x] ]
        = E[ v^T x x^T v - 2 v^T x v^T E[x] + v^T E[x] v^T E[x] ]
        = E[ v^T x x^T v - 2 v^T x E[x]^T v + v^T E[x] E[x]^T v ]
        = E[ v^T (x x^T - 2 x E[x]^T + E[x] E[x]^T) v ]
        = v^T E[ x x^T - 2 x E[x]^T + E[x] E[x]^T ] v
        = v^T E[ (x - E[x]) (x - E[x])^T ] v
        = v^T E[ (x - mu) (x - mu)^T ] v
        = v^T Sigma v.
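
If you would rather check the identity than follow the algebra, here is a small sketch that compares both sides exactly on a toy distribution. The four points and the vector v are made up for illustration, and x is assumed to be uniform over those points; exact fractions are used so the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction

# Hypothetical toy distribution: x is uniform over four 2-D points.
points = [(1, 2), (3, 0), (0, 1), (2, 5)]
p = Fraction(1, len(points))
v = (2, -1)  # arbitrary fixed vector

def mean(values):
    """Expected value of a quantity listed once per equally likely point."""
    return sum(values, Fraction(0)) * p

# Left-hand side: variance of the scalar v^T x.
proj = [v[0] * x0 + v[1] * x1 for x0, x1 in points]
proj_mean = mean(proj)
lhs = mean([(s - proj_mean) ** 2 for s in proj])

# Right-hand side: v^T Sigma v with Sigma = E[(x - mu)(x - mu)^T].
mu = [mean([x[i] for x in points]) for i in range(2)]
sigma = [[mean([(x[i] - mu[i]) * (x[j] - mu[j]) for x in points])
          for j in range(2)] for i in range(2)]
rhs = sum(v[i] * sigma[i][j] * v[j] for i in range(2) for j in range(2))

print(lhs == rhs)  # True
```

This only confirms the identity for one example, of course; the derivation is what shows it holds in general.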

2 comments

Also, the "E[foo]" notation is something you'd pick up in an introductory statistics course. Which, IMO, means it's perfectly appropriate to use it without further explanation in this sort of context.

It's not really reasonable to expect technical subjects like this to always be presented in a way that's easily digestible to people who lack any background in the subject area. This article is clearly aimed at people who are studying machine learning, and anyone who is studying machine learning should already have a good command of basic statistics and linear algebra.

Slightly more formatted: http://mathb.in/28658