by justQandA
2054 days ago
This looks really neat but I unfortunately don't have the necessary background knowledge to understand what this is even visualizing. Is there a tutorial to help acquire the necessary math/background knowledge to be able to comprehend what's being modeled? For example, what are "Power", "Alpha", "n", and "d"? Also "Type I" and "Type II" errors? What's the intuition as to how these all relate to each other? Is there a blog post or chapter that explains all of this? Or would this require significant learning, i.e. a full semester course or a textbook?
|
To set the context, we are trying to use data to help us test a hypothesis. An example might be: "if we give this pill to a person, they will be cured of their disease". Statisticians test this by setting up two groups: Group A gets nothing (or a placebo), Group B gets the pill.
In statistics, you start by assuming the "Null Hypothesis": that there is no difference between the two groups. You use hypothesis testing to help you "reject" the null hypothesis, to say that the groups are actually different. If the groups are different, that suggests the pill has an effect on the disease. So we take a bunch of data about the two groups, run some math on that data, and use the result of that math to help us decide if we can reject the null hypothesis.
Statistics is a bunch of tradeoffs between certainty, making the wrong call, and data volume. The terms you have mentioned are either "knobs" we can turn (tradeoffs we can make) or measures that help us understand our results.
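To make "run some math on that data" concrete, here is a minimal sketch of one possible test, a permutation test on the difference in means. The cough counts are made up for illustration, and `permutation_p_value` is a hypothetical helper name, not something from the visualization. The idea: if the null hypothesis is true, the group labels are meaningless, so we shuffle them many times and see how often chance alone produces a gap as big as the one we observed.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate how often shuffled labels give a gap at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_b) - statistics.mean(group_a))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels are arbitrary
        diff = abs(statistics.mean(pooled[n_a:]) - statistics.mean(pooled[:n_a]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Made-up data: coughs per day for each patient
placebo = [12, 15, 11, 14, 13, 16, 12, 15]  # Group A
pill = [9, 11, 8, 12, 10, 9, 11, 10]        # Group B

p_value = permutation_p_value(placebo, pill)
print("p-value:", p_value)
print("reject the null at the 5% level?", p_value < 0.05)
```

A small p-value means a gap this large would rarely show up by chance if the pill did nothing, which is the evidence we use to reject the null hypothesis.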
Here's what those terms mean:
Type 1 Error: also known as "False Positive". You thought the pill cured the disease, but it does not.
Type 2 Error: also known as "False Negative". You thought the pill did nothing, but it actually works.
Power: the chance of avoiding a Type 2 error (false negative) when the pill really does work. The higher your power, the lower the chance that you incorrectly conclude your pill is ineffective.
N: The number of "observations", in our case, the number of patients in the trial for our pill.
The others are a little trickier to explain.
Alpha: the chance of a Type 1 error (false positive) that you are willing to accept, chosen before you look at the data. A common choice is 0.05, or 5%. It is tied to the "confidence interval", which statisticians use to communicate how uncertain they are about a particular result. In our trial we might say "patients were 15% less likely to have the disease after taking the pill, give or take 2%". We don't think the decrease is exactly 15% (what we observed) but is instead somewhere in that neighborhood. An alpha of 0.05 corresponds to a 95% confidence interval: if you repeated the trial many times, about 5% of the intervals you computed (here, 13% to 17%) would miss the real effect.
Cohen's D: In our trial, we might measure "the number of times the patient coughed in a day" in addition to "do they have the disease anymore yes/no". In order to compare our two groups, we might look at the average number of coughs per day in group A vs group B. This is called measuring the "difference in means". Cohen's D takes the difference in means and divides it by the standard deviation (how spread out the measurements are), so it tells you how big the effect is relative to the normal variation between patients.
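One way to build intuition for how alpha, N, and power relate is to simulate many fake trials and count how often the test rejects the null hypothesis. The sketch below makes several simplifying assumptions that are mine, not the visualization's: measurements are normally distributed with a known standard deviation of 1, both groups have the same size, and a simple z-test is used. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean

def rejects_null(group_a, group_b, sd, alpha):
    """Two-sided z-test on the difference in means (assumes known sd, equal group sizes)."""
    n = len(group_a)
    standard_error = sd * (2 / n) ** 0.5
    z = (mean(group_b) - mean(group_a)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_diff, n, alpha=0.05, sd=1.0, trials=2000, seed=1):
    """Fraction of simulated trials in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(0.0, sd) for _ in range(n)]
        group_b = [rng.gauss(true_diff, sd) for _ in range(n)]
        rejections += rejects_null(group_a, group_b, sd, alpha)
    return rejections / trials

# Pill does nothing: every rejection is a Type 1 error, so the rate should sit near alpha.
type1_rate = rejection_rate(true_diff=0.0, n=30)

# Pill works (true gap of half a standard deviation): the rejection rate is the power.
power_small_n = rejection_rate(true_diff=0.5, n=30)
power_large_n = rejection_rate(true_diff=0.5, n=100)

print("Type 1 error rate:", type1_rate)
print("Power with n=30 per group:", power_small_n)
print("Power with n=100 per group:", power_large_n)
```

Under these assumptions, more patients buys more power at the same alpha, while raising alpha would buy power at the cost of more false positives. That is the tradeoff the "knobs" control.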
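As a rough sketch, Cohen's D can be computed by hand from two lists of measurements using the common pooled-standard-deviation form. The cough counts are made up for illustration, and `cohens_d` is a hypothetical helper name.

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: the two sample variances, weighted by degrees of freedom
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / pooled_variance ** 0.5

# Made-up data: coughs per day for each patient
placebo = [12, 15, 11, 14, 13, 16, 12, 15]  # Group A
pill = [9, 11, 8, 12, 10, 9, 11, 10]        # Group B

d = cohens_d(placebo, pill)
print("Cohen's d:", round(d, 2))
```

The sign only reflects which group is subtracted from which. Here it is negative because the pill group coughs less, and the size says the gap is a bit more than two standard deviations.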