Hacker News new | ask | show | jobs
by mb7733 673 days ago
I think the real-world resolution to this problem is straightforward though. You should look at the finest level of granularity available, and pick the best treatment in the relevant subpopulation for the patient.
2 comments

Unfortunately our level of certainty generally falls off as we increase the granularity. For example, imagine the patient is a 77yo Polish-American man, and we're lucky enough to have one historical result for 77yo Polish-American men. That man got treatment A and did better than expected. But say if we go out to 70-79y white men we have 1,000 people, of which 500 got treatment A and generally did significantly worse than the 500 who got treatment B. While the more granular category gives us a little information, the sample size is so small that we would be foolish to discard the less granular information.
This is all true. I originally added a disclaimer to my post that said "assuming you have enough data to support the level of granularity" but I removed it for brevity because I thought it was implied -- small sample size isn't part of Simpson's paradox. My apologies for being unclear
The smaller the subpopulation, the higher the variance, and the less significant the result.