by nerdponx
859 days ago
It's better than "it works well in practice". The question is misguided as stated. It's like asking why chemists care about density for measuring mass. If you are looking at the probability of any particular outcome of a continuous random variable, then you do not understand how probability works. The probability of any particular real number arising from a continuous probability distribution on the real numbers is exactly 0. It's not an arbitrarily small epsilon greater than zero; it's actually zero. This property is in fact required for probability to make sense mathematically. You might ask questions like why maximum likelihood works as an optimization criterion, but that's very different from asking why we care about likelihood at all. The comments on the original question do a good job of cutting through this confusion.
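To make the "exactly zero" point concrete, here is a minimal sketch, assuming a standard normal distribution and an arbitrary point x0 = 0.5 (both are choices made purely for illustration, as are the function names). It shows the probability of a shrinking interval around x0 going to zero while the ratio probability / width settles on the density, which is the quantity that likelihood actually uses.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    # Cumulative distribution function of a normal, via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Density of a normal distribution at x.
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def interval_probability(x0, eps):
    # P(x0 - eps < X < x0 + eps) for a standard normal X.
    return normal_cdf(x0 + eps) - normal_cdf(x0 - eps)

x0 = 0.5  # illustrative point, not taken from the discussion
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    p = interval_probability(x0, eps)
    # p shrinks toward 0, but p / (2 * eps) approaches the density at x0.
    print(f"eps={eps:g}  P(interval)={p:.3e}  P/width={p / (2 * eps):.6f}")

print(f"density at x0: {normal_pdf(x0):.6f}")
```

The probability of the single point is the eps = 0 limit, which is zero, while the density stays finite and informative.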
I can try to sketch an explanation from the Bayesian framework (though, as I mentioned, it's not the only relevant one).
Likelihood is P(measurement=measurement'|parameter=parameter'). This is a small value. Given a prior, we can compute P(parameter=parameter'|measurement=measurement'). This is also small. But when we compute P(parameter'-k<parameter<parameter'+k|measurement=measurement'), all the smallness cancels; see the formulation of Bayes' theorem that reads
P(X_i|Y) = P(X_i)P(Y|X_i) / (sum_j P(X_j)P(Y|X_j))
I'm obviously skipping a lot of steps here because I'm sketching an explanation rather than giving a full one.
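A rough numerical sketch of that cancellation, under assumptions that are mine rather than part of the argument above: the parameter is the mean of a unit-variance normal, restricted to a grid of candidate values X_j with a uniform prior, and the measurements are a made-up deterministic data set. Each likelihood value (here a product of densities) is tiny, yet the normalized posterior mass on an interval around the best parameter is of order one.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    # Density of a normal distribution at x.
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical measurements scattered around 2.0 (deterministic, for repeatability).
measurements = [2.0 + 0.1 * ((i * 7) % 11 - 5) for i in range(50)]

# Candidate parameter values X_j and a uniform prior P(X_j) over them.
grid = [j * 0.01 for j in range(401)]  # 0.00, 0.01, ..., 4.00
prior = [1.0 / len(grid)] * len(grid)

# Likelihood P(Y | X_j): product of densities over all measurements.
likelihoods = [math.prod(normal_pdf(y, theta) for y in measurements) for theta in grid]

# Bayes: P(X_i | Y) = P(X_i) P(Y | X_i) / sum_j P(X_j) P(Y | X_j)
evidence = sum(p * l for p, l in zip(prior, likelihoods))
posterior = [p * l / evidence for p, l in zip(prior, likelihoods)]

# Posterior mass on the interval (theta' - k, theta' + k) around the best grid point.
best = max(range(len(grid)), key=lambda j: posterior[j])
k = 0.3
interval_mass = sum(q for theta, q in zip(grid, posterior) if abs(theta - grid[best]) < k)

print(f"largest likelihood value: {max(likelihoods):.3e}")
print(f"best parameter on grid:   {grid[best]:.2f}")
print(f"posterior mass within k:  {interval_mass:.3f}")
```

The tiny common scale of the likelihood values appears in both the numerator and the denominator of Bayes' formula, so it drops out of the posterior. For much larger data sets one would work with log-likelihoods to avoid floating-point underflow.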