by TheMrZZ
670 days ago
The biggest trap of Simpson's paradox is that the result can change at every level of granularity. Take the example of Treatment A vs Treatment B for tumors: you can get endless layers of seemingly contradictory statements:
- Overall, Treatment A has better average results
- But if you add tumor size, Treatment B is always better
- But if you add gender to size, Treatment B is always better
- But if you add age category to gender and size, Treatment A is always better
- etc.

It totally contradicts our instincts, and shows that statistics can be profoundly misleading (intentionally or not).
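To make the first two layers concrete, here is a minimal sketch. The counts are hypothetical, picked only so the reversal shows up; they are not taken from the spreadsheet or gist linked in this thread, and the names `counts`, `overall` and `per_size` are mine.

```python
from fractions import Fraction

# Hypothetical (successes, patients) counts per treatment and tumor size.
# These numbers are illustrative assumptions, not real study data.
counts = {
    "A": {"small": (234, 270), "large": (55, 80)},
    "B": {"small": (81, 87), "large": (192, 263)},
}


def overall_rate(by_size):
    """Success rate after pooling every size group of one treatment."""
    successes = sum(s for s, _ in by_size.values())
    patients = sum(n for _, n in by_size.values())
    return Fraction(successes, patients)


# Layer 1: ignore tumor size entirely.
overall = {t: overall_rate(groups) for t, groups in counts.items()}

# Layer 2: compare the treatments within each tumor size.
per_size = {
    t: {size: Fraction(s, n) for size, (s, n) in groups.items()}
    for t, groups in counts.items()
}

for treatment in counts:
    print(f"Treatment {treatment} overall: {float(overall[treatment]):.3f}")
    for size, rate in per_size[treatment].items():
        print(f"  {size}: {float(rate):.3f}")
```

With these made-up counts, Treatment A wins when the groups are pooled, while Treatment B wins inside both the small and the large group. Adding a third variable would just mean one more level of nesting in `counts`.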
Results can be found in this GSheet: https://docs.google.com/spreadsheets/d/1tsBhElTgXjVTeas8quar...
Code is here: https://gist.github.com/TheMrZZ/c33927ca2cc917997a67d7f84b82...
I'm currently running the 3-variables version, hopefully I'll get results this afternoon.
We can clearly see the same problem that arises in the 1-variable Simpson's paradox: wildly different population sizes across subgroups.
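A rough way to see why the population sizes matter (the notation is my own, not something from the spreadsheet): for a treatment $T$, the pooled success rate is a weighted average of the subgroup rates, where the weights are that treatment's own subgroup shares.

```latex
r_T \;=\; \sum_{s} \frac{n_{T,s}}{N_T}\, r_{T,s},
\qquad N_T = \sum_{s} n_{T,s}
```

Here $n_{T,s}$ is the number of patients in subgroup $s$ who received $T$, and $r_{T,s}$ is the success rate in that subgroup. Since the weights $n_{T,s}/N_T$ differ between treatments, $r_{B,s} > r_{A,s}$ can hold for every $s$ while $r_A > r_B$ still holds overall, as long as A's patients are concentrated in the subgroups with high success rates.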