

by nerdponx 859 days ago

It's better than "it works well in practice".

The question is misguided as stated. It's like asking why chemists care about density for measuring mass.

If you are looking at the likelihood of any particular outcome of a continuous random variable, then you do not understand how probability works.

The probability of any particular real number arising from a probability distribution on the real numbers is exactly 0. It's not an arbitrarily small epsilon greater than zero, it's actually zero. This property is in fact required for probability to make sense mathematically.

You might ask questions like why does maximum likelihood work as an optimization criterion, but that's very different from asking why we care about likelihood at all.

The comments on the original question do a good job of cutting through this confusion.

2 comments

kjhcvkek77 859 days ago

I appreciate your response but I don't really agree. They say that likelihood can be multiplied by any scale factor or that it's only the comparative difference that matters, or we can make a little plot, but they don't actually explain why.

I can try to give an explanation from the Bayesian framework (but as I mentioned, it's not the only relevant one).

Likelihood is P(measurement=measurement'|parameter=parameter'). This is a small value. Given a prior, we can compute P(parameter=parameter'|measurement=measurement'). This is also small. But when we compute P(parameter'-k<parameter<parameter'+k|measurement=measurement'), all the smallness cancels; see the formulation of Bayes' theorem that reads:

P(X_i|Y) = P(X_i)P(Y|X_i) / (sum_j P(X_j)P(Y|X_j))

I'm obviously skipping a lot of steps here because I'm sketching an explanation rather than giving one.
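
To make the cancellation concrete, here is a minimal sketch of the discrete Bayes formula above. Everything specific in it is an assumption made for illustration, not part of the argument: a normal measurement model, a flat prior on a grid of candidate parameter values, and the made-up numbers for the measurement, standard deviation, and k.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution; used here as the likelihood."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior(prior, likelihood):
    """P(X_i|Y) = P(X_i)P(Y|X_i) / sum_j P(X_j)P(Y|X_j)."""
    weights = [p * l for p, l in zip(prior, likelihood)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical setup: candidate parameter values X_j on a grid, flat prior.
grid = [i / 100 for i in range(-300, 301)]
prior = [1 / len(grid)] * len(grid)
measurement = 0.7   # assumed observed value
sd = 0.5            # assumed measurement noise

likelihood = [normal_pdf(measurement, theta, sd) for theta in grid]

post = posterior(prior, likelihood)
# Rescaling the likelihood by any constant leaves the posterior unchanged,
# because the constant appears in both numerator and denominator.
scaled_post = posterior(prior, [1e-12 * l for l in likelihood])

# Posterior probability that the parameter lies within k of the measurement.
k = 0.5
interval_mass = sum(p for theta, p in zip(grid, post) if abs(theta - measurement) < k)

print(max(abs(a - b) for a, b in zip(post, scaled_post)))  # rounding error only
print(interval_mass)  # an ordinary-sized probability, not a tiny one
```

Each individual posterior value shrinks as the grid gets finer, but the interval probability stays roughly the same, which is the sense in which only relative likelihood matters.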

oasisaimlessly 859 days ago

> The probability of any particular real number arising from a probability distribution on the real numbers is exactly 0. It's not an arbitrarily small epsilon greater than zero, it's actually zero.

Nitpicking somewhat, but e.g. `max(1, uniform(0, 2))` has a very non-zero probability of evaluating to 1.
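
A quick simulation sketch of that example, assuming `uniform(0, 2)` means a draw from the continuous uniform distribution on [0, 2] (Python's `random.uniform` stands in for it here; the sample count and seed are arbitrary). Every draw below 1 gets clamped to 1, so the result is a mixed distribution with a point mass of 1/2 at 1.

```python
import random

def fraction_equal_to_one(n_samples, seed=0):
    """Estimate P(max(1, U) == 1) for U uniform on [0, 2] by simulation."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if max(1, rng.uniform(0, 2)) == 1)
    return hits / n_samples

estimate = fraction_equal_to_one(200_000)
print(estimate)  # should land close to 0.5
```

The values above 1 still each have probability zero individually; only the clamped point carries positive mass.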