by cshimmin 1530 days ago
I work in this field (different experiment); despite the downvotes this is a reasonable question. Reposting my comment from above, since there is confusion here (the other sibling comments are incorrect).

In particle physics, sigma denotes "significance", not standard deviation. Technically, what we quote as "sigmas" are "z-values": z = Phi^{-1}(1 - p), where Phi^{-1} is the inverse CDF of the standard Normal distribution and p is the p-value of the experimental result. So 7 sigma is defined to be the level of significance (for an arbitrary distribution) corresponding to the same quantile as 7 standard deviations out in a Normal distribution.
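To make the z = Phi^{-1}(1 - p) relation concrete, here is a minimal Python sketch using only the standard library. The function names `p_from_z` and `z_from_p` are mine, and I'm assuming the one-sided tail convention implied by the formula above.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def p_from_z(z: float) -> float:
    """One-sided tail probability P(Z >= z) for a standard Normal Z."""
    # erfc keeps precision in the far tail, where 1 - cdf(z) would lose digits.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_from_p(p: float) -> float:
    """Significance z = Phi^{-1}(1 - p), computed as -Phi^{-1}(p) by symmetry."""
    return -_STD_NORMAL.inv_cdf(p)

for z in (3, 5, 7):
    print(f"{z} sigma -> p = {p_from_z(z):.3g}")
```

With this convention, 5 sigma corresponds to a p-value of roughly 2.9e-7 and 7 sigma to roughly 1.3e-12.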


This is the correct answer.

In other words, "z sigma" means that a result like this arising as a statistical fluke is just as likely as a standard-normal variable taking a value above z.

I would add: "If the null hypothesis is true, then a result like this..." (In this case the null hypothesis is, of course, that the Standard Model is true.)
If the null hypothesis were true, and the experiment were repeated an infinite number of times with a different sample each time, then "a result like this or more extreme..."
I agree with adding the "more extreme" part, but I'm not so sure about the infinite number of times part. Certainly, the p-value is (roughly speaking) the probability of seeing a result at least as extreme as the observed result, under the null hypothesis. But one doesn't really need to introduce hypothetical infinite sequences of replications to make sense of that definition.
Isn't the bit about repeating the study over and over again the whole basis of frequentist statistics, though? (Indeed isn't that why it's called frequentism?)
Sort of. You don't need identical replications of the same experiment, just long-run probabilities over any applications of the method. See example two here: https://normaldeviate.wordpress.com/2012/11/17/what-is-bayes...

(The author is a stats professor at CMU.)

Quoting: "The plot shows the first 50 simulations. In the first simulation I picked some distribution {F_1}. Let {\theta_1} be the median of {F_1}. I generated {n=100} observations from {F_1} and then constructed the interval. The confidence interval is the first vertical line. The true value is the dot. For the second simulation, I chose a different distribution {F_2}. Then I generated the data and constructed the interval. I did this many times, each time using a different distribution with a different true median. The blue interval shows the one time that the confidence interval did not trap the median. I did this 10,000 times (only 50 are shown). The interval covered the true value 94.33 % of the time. I wanted to show this plot because, when some texts show confidence interval simulations like this they use the same distribution for each trial. This is unnecessary and it gives the false impression that you need to repeat the same experiment in order to discuss coverage."
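A rough sketch of that kind of simulation, in case it helps. Everything specific here is my assumption rather than the quoted post's actual setup: the three distribution families, their parameter ranges, and the order-statistic interval [X_(40), X_(61)], which for n = 100 covers the median of any continuous distribution with probability P(40 <= B <= 60), B ~ Binomial(100, 1/2), a bit over 95%.

```python
import math
import random

def random_distribution(rng):
    """Pick a hypothetical continuous distribution; return (sampler, true_median)."""
    kind = rng.choice(["normal", "uniform", "shifted_exponential"])
    loc = rng.uniform(-10.0, 10.0)
    scale = rng.uniform(0.5, 5.0)
    if kind == "normal":
        return (lambda: rng.gauss(loc, scale)), loc
    if kind == "uniform":
        return (lambda: rng.uniform(loc, loc + scale)), loc + scale / 2.0
    # Exponential with mean `scale`, shifted by `loc`; median = loc + scale * ln 2.
    return (lambda: loc + rng.expovariate(1.0 / scale)), loc + scale * math.log(2.0)

def median_interval(sample, lower_rank=40, upper_rank=61):
    """Order-statistic interval [X_(40), X_(61)] (1-indexed ranks), meant for n = 100."""
    ordered = sorted(sample)
    return ordered[lower_rank - 1], ordered[upper_rank - 1]

def coverage(trials=10_000, n=100, seed=0):
    """Fraction of trials whose interval traps the true median.

    A fresh distribution (with a different true median) is drawn for every trial.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        draw, true_median = random_distribution(rng)
        lo, hi = median_interval([draw() for _ in range(n)])
        hits += lo <= true_median <= hi
    return hits / trials

print(f"coverage = {coverage():.4f}")
```

The point is the same as in the quote: the distribution changes on every trial, yet the long-run coverage of the procedure stays near its nominal level.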

What's that in Bayesian terms?
The probability of an N(0,1) variable coming out >= 7 (so, one minus the CDF of the standard normal distribution at 7).
> sigma denotes "significance", not standard deviation.

Nitpick: this is still a standard deviation in some (potentially very contrived and nonlinear) coordinate system. (As a simple example, a log-normal distribution might have a mean of 1 and a standard deviation that effectively amounts to multiplying or dividing by 2. Edit: also, multidimensional stuff might have to be shoehorned into a polar coordinate system.) But in practice you'd never bother to construct such a coordinate system, so that's more a mathematical artifact than anything useful.

No, there is no coordinate system. This is referring to the distribution of a test statistic for hypothesis testing. It's a 1-d real scalar, and coordinate transforms don't have any meaningful statistical representation. Of course there are much higher-dimensional distributions, in all sorts of coordinate systems, involved in sampling the test statistic, but at the end of the day this is all you are left with. If you change the underlying distributions of the model, then of course you will change the test statistic distribution, but that's meaningless, since the whole point of the test statistic is to quantify an observation in the context of a given model.

Anyway, as I mentioned elsewhere, the motivation for calling it sigma is that, by construction, it maps onto the quantiles of the standard Normal distribution. So an N-sigma result will have the same p-value as N standard deviations in a Normal distribution. So you can associate "sigmas" with "standard deviations of the Normal distribution". Perhaps this is what you are trying to say, but it does not make sigma a standard deviation in any statistical sense, i.e. it is not necessarily related to the variance of the relevant distribution.
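A toy illustration of that last point, as a sketch only: suppose (hypothetically) the test statistic follows an Exponential(1) distribution under the null, and we observe t = 20. The function name and the choice of null distribution are mine, picked because the tail probability has a closed form.

```python
import math
from statistics import NormalDist

def significance_for_exponential_null(t_obs: float, rate: float = 1.0):
    """Return (z, naive) for an observed statistic under an Exponential(rate) null."""
    p_value = math.exp(-rate * t_obs)       # P(T >= t_obs) under the null
    z = -NormalDist().inv_cdf(p_value)      # same quantile on the standard Normal
    mean = std = 1.0 / rate                 # Exponential: mean == std == 1/rate
    naive = (t_obs - mean) / std            # std devs from the mean of the null itself
    return z, naive

z, naive = significance_for_exponential_null(20.0)
print(f"z = {z:.2f} sigma, but {naive:.0f} standard deviations of the null distribution")
```

Here the observation sits 19 standard deviations of the null distribution above its mean, yet the quoted significance would be a little under 6 sigma, because "sigma" only tracks the p-value.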

oh wow, thanks for pointing this out :)
For what it's worth, sigma is chosen for this purpose specifically to evoke the notion of "standard deviations". But quoting the std dev. directly is useless, since the distribution is unspecified. So we "convert" the statistical significance to the corresponding number of standard deviations of the Normal distribution, since that is a familiar distribution. If you like, it's another way of stating p-values, which physicists prefer because ours can have lots of zeros :)