|
|
|
|
|
by WCSTombs
45 days ago
|
|
Mathematically, it is literally a probability distribution, because it fits the definition of a measure whose total mass is one, so I think the language is just imprecise. What they may be trying to say is that semantically it doesn't arise in a principled way from an uncertainty model, such as from Bayesian or frequentist statistics. |
|
This is exactly the sense that it comes up for old school LMs and why it appears in thermodynamics.
Of course it is entirely possible that newfangled ML people use it without understanding that it is derived from first principles - i.e. see article.