|
I think most of the replies, here and on Stack Exchange, are answering slightly the wrong question. It is fair to ask why likelihoods are useful if they are so small, and it's not a good answer to talk about how they could be expressed as logs, or even about the properties of continuous distributions. I think the answer is: yes, individual likelihoods are tiny, and yes, even the MLE solution is extremely unlikely to be exactly correct. However, the idea is that often a lot of the probability mass - an amount that is not small - is concentrated around the maximum likelihood estimate, and that's why it makes a good estimate and is worth using. It is much like how the average is unlikely to be the exact value of a new sample from the distribution, but is still a good way of describing what to expect. (It gets better if you augment it with some measure of dispersion, and so on.) (If the distribution is very dispersed, the average is less useful as an idea of what to expect, although it still minimises expected prediction error under some loss functions, squared error for example; but that's a different thing and I think less relevant here.) |
The way the question demonstrates "smallness" is wrong, however. They quote the product of the likelihoods of 50 randomly sampled values - 9.183016e-65 - as if the smallness of this value were significant or meant anything at all. Forget the issue of continuous sampling from a normal distribution, and just consider the simple discrete case of flipping a fair coin. The probability of any specific sequence of 50 flips is 0.5 ^ 50, a really small number (about 8.9e-16). That's because the probability is, in fact, really small!
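
To make the coin example concrete, here is a rough Python sketch. The count of 30 heads, the grid of candidate values and the ±0.1 window are all made up for illustration, and normalising the likelihood over a grid amounts to assuming a flat prior on the coin's bias, so treat the "share" as indicative only.

```python
n = 50

# Probability of one specific sequence of 50 fair coin flips.
p_sequence = 0.5 ** n
print(p_sequence)  # tiny, and that is exactly what we should expect

# Hypothetical data: suppose 30 of the 50 flips came up heads.
heads = 30
p_hat = heads / n  # maximum likelihood estimate of P(heads)

def likelihood(p):
    """Likelihood of one specific sequence with `heads` heads, given bias p."""
    return p ** heads * (1 - p) ** (n - heads)

# Even at its maximum, the likelihood is a very small number.
print(likelihood(p_hat))

# Normalise the likelihood over a grid of candidate biases (flat-prior
# assumption) and see how much of the total lies within 0.1 of the MLE.
grid = [i / 1000 for i in range(1, 1000)]
values = [likelihood(p) for p in grid]
total = sum(values)
near_mle = sum(v for p, v in zip(grid, values) if abs(p - p_hat) <= 0.1)
share = near_mle / total
print(share)
```

The absolute likelihood values are minuscule everywhere, but the share near the estimate is large; the size of the raw number says nothing about how good the estimate is.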