by yccs27 1525 days ago
This is the correct answer.

In other words, "z sigma" means that a result like this occurring as a statistical fluke is just as likely as a standard-normal distributed variable taking a value above z.
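To put numbers on that, here is a minimal sketch of the one-sided tail probability of a standard normal variable above z. The function name `tail_probability` and the z values in the loop are illustrative choices, not anything from the thread.

```python
import math

def tail_probability(z: float) -> float:
    """P(Z > z) for a standard normal Z (one-sided upper tail)."""
    # erfc keeps precision in the far tail, where 1 - CDF would round off.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Illustrative z values only.
for z in (1, 2, 3, 5):
    print(f"{z} sigma -> {tail_probability(z):.3e}")
```

This is the one-sided convention described in the comment above; a two-sided convention would double the value.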

I would add: If the null hypothesis is true, then "the result like this..." (in this case the null hypothesis is, of course, that the standard model is true)
If the null hypothesis were true, and the experiment were repeated an infinite number of times with a different sample each time, then "the result like this or more extreme ..."
I agree with adding the "more extreme" part, but I'm not so sure about the infinite number of times part. Certainly, the p-value is (roughly speaking) the probability of seeing a result at least as extreme as the observed result, under the null hypothesis. But one doesn't really need to introduce hypothetical infinite sequences of replications to make sense of that definition.
Isn't the bit about repeating the study over and over again the whole basis of frequentist statistics, though? (Indeed isn't that why it's called frequentism?)
Sort of. You don't need identical replications of the same experiment, just long-run probabilities for any application of the method. See example two here: https://normaldeviate.wordpress.com/2012/11/17/what-is-bayes...

(The author is a stats professor at CMU.)

Quoting: "The plot shows the first 50 simulations. In the first simulation I picked some distribution {F_1}. Let {\theta_1} be the median of {F_1}. I generated {n=100} observations from {F_1} and then constructed the interval. The confidence interval is the first vertical line. The true value is the dot. For the second simulation, I chose a different distribution {F_2}. Then I generated the data and constructed the interval. I did this many times, each time using a different distribution with a different true median. The blue interval shows the one time that the confidence interval did not trap the median. I did this 10,000 times (only 50 are shown). The interval covered the true value 94.33 % of the time. I wanted to show this plot because, when some texts show confidence interval simulations like this they use the same distribution for each trial. This is unnecessary and it gives the false impression that you need to repeat the same experiment in order to discuss coverage."
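A rough sketch of that kind of simulation, for anyone who wants to try it. This is not the blog author's code: the interval here is the standard distribution-free order-statistic interval for the median, and the four distribution families, their parameter ranges, and all function names are my own assumptions.

```python
import math
import random

def median_ci_ranks(n, level=0.95):
    """1-based ranks (lo, hi) so that [x_(lo), x_(hi)] covers the median
    with probability >= level for any continuous distribution."""
    # The count of observations below the true median is Binomial(n, 1/2).
    pmf = [math.comb(n, j) / 2**n for j in range(n + 1)]
    best = 1
    for k in range(1, n // 2 + 1):
        if sum(pmf[k:n - k + 1]) >= level:
            best = k  # keep the narrowest interval that still reaches the level
    return best, n - best + 1

def random_distribution(rng):
    """Pick a distribution at random; return (sampler, true_median).
    Families and parameter ranges are arbitrary illustrative choices."""
    kind = rng.choice(["normal", "exponential", "uniform", "lognormal"])
    if kind == "normal":
        mu, sigma = rng.uniform(-10, 10), rng.uniform(0.5, 5)
        return (lambda: rng.gauss(mu, sigma)), mu
    if kind == "exponential":
        rate = rng.uniform(0.1, 5)
        return (lambda: rng.expovariate(rate)), math.log(2) / rate
    if kind == "uniform":
        a, width = rng.uniform(-10, 10), rng.uniform(1, 20)
        return (lambda: rng.uniform(a, a + width)), a + width / 2
    mu, sigma = rng.uniform(-2, 2), rng.uniform(0.2, 1.5)
    return (lambda: rng.lognormvariate(mu, sigma)), math.exp(mu)

def coverage(trials=10_000, n=100, seed=0):
    """Fraction of trials whose interval traps the true median,
    using a different distribution on every trial."""
    rng = random.Random(seed)
    lo, hi = median_ci_ranks(n)
    hits = 0
    for _ in range(trials):
        draw, true_median = random_distribution(rng)
        xs = sorted(draw() for _ in range(n))
        if xs[lo - 1] <= true_median <= xs[hi - 1]:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should come out at or a little above the nominal 95%, even though no two trials share a distribution, which is the point of the quoted passage.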

Yeah, that's what I remember from grad school. Thanks for the link!
What's that in Bayesian terms?
The probability of N(1,1) emitting >= 7. (So, one minus the CDF of the normal distribution at 7)
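Taking that literally, a quick sketch with Python's standard library (assuming N(1,1) means mean 1 and standard deviation 1; the variable names are mine):

```python
from statistics import NormalDist

# Assumption: N(1,1) is a normal distribution with mean 1 and standard deviation 1.
dist = NormalDist(mu=1.0, sigma=1.0)

# One minus the CDF at 7, i.e. the probability of a draw >= 7.
p = 1.0 - dist.cdf(7.0)
print(f"P(X >= 7) = {p:.3e}")
```

Since 7 sits six standard deviations above the mean of 1, this is the same as the standard-normal tail above 6.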