by trthatcher 4554 days ago
You have the right idea.

You're assuming an underlying model for the data. You have a test statistic (one that estimates a model parameter) and a hypothesis about that parameter. The p-value is the probability of getting a test statistic at least as extreme as the one observed, assuming that your hypothesis is true.

Ex. You have a sample of 1000 men's heights. You compute a sample average height of 5'9 and a sample standard deviation of 3 inches.

(Unlikely) hypothesis: the average height is 4 feet. Your p-value is the probability of getting a sample average at least as extreme as 5'9 given that your 4 ft height hypothesis is true. 5'9 is 21 inches from 4 ft, which is 7 sample standard deviations. The sample average varies far less than an individual height, though: its standard error is 3/sqrt(1000), about 0.095 inches, so 5'9 is roughly 220 standard errors from 4 ft. The p-value is going to be vanishingly small, so you'll reject that.
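
For anyone who wants to see the arithmetic, here is a rough sketch of the height example in Python. It assumes a large-sample z-test with a normal approximation and a two-sided alternative, neither of which is spelled out above, and the function name z_test_mean is just made up for illustration.

```python
import math

def z_test_mean(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (z, two-sided p-value) for a large-sample z-test of a mean.

    Assumes the sample mean is approximately normal, which is an
    assumption of this sketch rather than something stated in the thread.
    """
    # The sample mean varies with the standard error, not the raw SD.
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Two-sided tail of a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Heights in inches: 5'9 = 69, 4 ft = 48, sample SD = 3, n = 1000.
z, p = z_test_mean(sample_mean=69.0, hypothesized_mean=48.0, sample_sd=3.0, n=1000)
print(f"z = {z:.1f}, p = {p}")
```

The z value comes out a bit over 221, and the tail probability is so small that it is indistinguishable from zero in double precision.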

Note: I'm leaving out details and assumptions


Yeah, that's what I was taught. So what are people answering instead?
People can easily be led to misinterpret p-values even if they can define them. Most often, people assume that p-values indicate something about the correctness of a model or an inference. This is the classic P(D|H) vs. P(H|D) confusion: the p-value is a statement about the data given the hypothesis, not about the hypothesis given the data.
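
A toy illustration of the gap between the two, using Bayes' rule. All of the numbers here are made up purely for illustration (the prior and both likelihoods are hypothetical), and posterior_h0 is not a library function, just a name for this sketch.

```python
def posterior_h0(prior_h0, p_data_given_h0, p_data_given_h1):
    """P(H0 | D) via Bayes' rule for two competing hypotheses H0 and H1."""
    prior_h1 = 1.0 - prior_h0
    evidence = p_data_given_h0 * prior_h0 + p_data_given_h1 * prior_h1
    return p_data_given_h0 * prior_h0 / evidence

# Hypothetical inputs: the data are fairly unlikely under H0 (0.04),
# but H0 was considered very plausible beforehand (0.9) and the
# alternative only explains the data moderately well (0.2).
post = posterior_h0(prior_h0=0.9, p_data_given_h0=0.04, p_data_given_h1=0.2)
print(f"P(D|H0) = 0.04, P(H0|D) = {post:.3f}")
```

With these inputs P(D|H0) is 0.04 while P(H0|D) is about 0.64, so a small probability of the data under the hypothesis does not by itself make the hypothesis improbable.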