by yummyfajitas
3649 days ago
I have plenty of criticisms of hypothesis testing and p-values. Nevertheless, if you choose to run that type of analysis, do it right - this means sticking with your analysis and not using weasel words like "almost statistically significant" when it doesn't come out the way you want. Incidentally, the real p-value is 11.075%, since they ran two hypothesis tests and didn't adjust for multiple comparisons.

Your analysis might be right - if so, that's interesting. I'll take a closer look and write a follow-up piece if it holds up. Among other things, glancing at your ROC curve suggests they are pretty close, performing better for whites in some regions and better for blacks in others. But it's 7:30AM (pre-coffee) and I haven't looked closely yet.

But since PP did not do any of this, my criticism of them holds - they ran an NHST, got the wrong result, and then spouted a bunch of anecdotes instead of admitting that their analysis went against what they wanted to find.
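For anyone who wants to check the 11.075% figure: it appears to be consistent with a Šidák-style adjustment of the reported p = 0.057 across two tests. That is an inference from the numbers, not something stated in PP's writeup, and the function names below are made up for illustration. A minimal sketch:

```python
def sidak_adjust(p: float, m: int) -> float:
    """Sidak-adjusted p-value: chance of at least one result this
    extreme among m independent tests, 1 - (1 - p)^m."""
    return 1.0 - (1.0 - p) ** m


def bonferroni_adjust(p: float, m: int) -> float:
    """Bonferroni-adjusted p-value: p * m, capped at 1."""
    return min(1.0, p * m)


# Assumed inputs: p = 0.057 (the figure from the methodology writeup)
# and m = 2 (the two hypothesis tests mentioned above).
p_reported = 0.057
num_tests = 2

print(f"Sidak:      {sidak_adjust(p_reported, num_tests):.3%}")       # 11.075%
print(f"Bonferroni: {bonferroni_adjust(p_reported, num_tests):.3%}")  # 11.400%
```

Either correction moves the result further from the usual 5% threshold; the Šidák form additionally assumes the two tests are independent.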
> Finally, the article includes a table of false positive probabilities (FPP) and false negative probabilities (FNP). This may or may not be evidence of bias - the authors would need to run a statistical test to determine that, which they don't. In fact, I can't even find the place in their R notebook where they did that calculation. Is this the result of bad statistics? Is it merely random chance? Who knows!
Looking at PP's Jupyter Notebook, the calculations seem to be performed at lines 50 onwards (if you're referring to the table that I think you're referring to).
FWIW, those "weasel words" you allege are in the writeup of the methodology, where the audience is expected to follow along and see how the 0.057 is calculated. I'm not sure how you're interpreting that calculation... My read is that it's not the bedrock on which all of the other analyses are based. Where in the story do you see that particular calculation being used as the main (or even ancillary) thrust of the piece?