Hacker News
by sokoloff 1343 days ago
If colonoscopies are helpful to avoid negative outcomes, but invitations to colonoscopies are not, looking into making invitations better seems like an obvious play to me.

The problem is you can't run a study on colonoscopies that doesn't involve an invitation without forcing people to get a colonoscopy at gunpoint. Which would both be illegal and never pass an IRB.
> Which would both be illegal and never pass an IRB.

To be fair, this is not two separate problems. Anyone who's willing to run the illegal study will not care whether they can get IRB approval.

You can cross invitation vs no-invitation with colonoscopy vs no-colonoscopy and analyze the outcomes of all four cells (provided you have enough people in each cell), or analyze each row or column on its own.
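For concreteness, here is a minimal sketch of the four-cell breakdown described above. The record layout `(invited, screened, bad_outcome)`, the helper names, and the sample records are all made up for illustration; nothing here comes from a real study.

```python
from collections import defaultdict

def cell_rates(records):
    """records: iterable of (invited, screened, bad_outcome) booleans.
    Returns {(invited, screened): (people, bad_outcome_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # [people, bad outcomes]
    for invited, screened, bad in records:
        cell = counts[(invited, screened)]
        cell[0] += 1
        cell[1] += int(bad)
    return {key: (n, bad / n) for key, (n, bad) in counts.items()}

def marginal_rate(records, index, value):
    """Bad-outcome rate for one row or column on its own.
    index=0 selects on invitation, index=1 on colonoscopy."""
    subset = [r for r in records if r[index] == value]
    return sum(r[2] for r in subset) / len(subset)

# Made-up records, purely to show the shape of the output.
records = [
    (True, True, False), (True, True, False),
    (True, False, True), (True, False, False),
    (False, True, False),
    (False, False, True), (False, False, False), (False, False, True),
]

rates = cell_rates(records)
for (invited, screened), (n, rate) in sorted(rates.items()):
    print(f"invited={invited} screened={screened} n={n} rate={rate:.2f}")
print("invited, any screening status:", marginal_rate(records, 0, True))
```

This only tabulates rates per cell. Whether the cells can fairly be compared with each other is a separate question.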
You can do that, but you won't learn anything, because the populations within each cell are not comparable to each other.
It seems to me that with electronic medical records you could do population-wide studies using data that already exists (and I think that would pass IRB, or at least "ought to"). That would likely tell you "for the patients matching criteria X (are covered by BCBS and live in a given state, or whatever), the 10 year outcome for patients who turned 50 in 2010 was Y vs Z conditioned on whether they had a colonoscopy".
What would you be trying to learn from that? You're comparing unlike groups. You can't draw any conclusions.
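A toy calculation shows how the "unlike groups" objection can play out. Every number below is invented: a hypothetical hidden trait (being health-conscious) raises the chance of getting screened and independently lowers 10 year risk, while the colonoscopy itself is assumed to have zero effect.

```python
# Hypothetical strata: (share of population, P(screened), 10 year risk).
# Risk depends only on the stratum, never on screening, so the true
# effect of screening in this toy model is zero by construction.
strata = {
    "health_conscious": (0.5, 0.8, 0.01),
    "not_health_conscious": (0.5, 0.2, 0.03),
}

def naive_risk(strata, screened):
    """Risk among the screened (or unscreened), ignoring the strata."""
    people = 0.0
    events = 0.0
    for share, p_screen, risk in strata.values():
        weight = share * (p_screen if screened else 1 - p_screen)
        people += weight
        events += weight * risk
    return events / people

risk_screened = naive_risk(strata, True)
risk_unscreened = naive_risk(strata, False)
print(f"screened:   {risk_screened:.3f}")
print(f"unscreened: {risk_unscreened:.3f}")
```

Under these assumed numbers the screened group shows roughly 1.4% risk against roughly 2.6% for the unscreened group, even though screening does nothing in the model. The whole gap comes from who ends up in each group.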
If I look at all people born in 1960, living in NY, covered by BCBS, that’s a like group.

If I then split by “had a colonoscopy between 2008 and 2012, inclusive” vs “didn’t” and look at 2012 through 2021 outcomes to draw conclusions, it’s possible that the split itself makes them unlike groups (by definition it does, at least on the primary selection criterion). Given that the effort is approximately that of a SQL query, I’d be interested to know whether there’s a possible signal there. It would need to be corroborated with other data sets to establish that the correlation is repeatable, and only then examined for a likely causal link.
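A rough sketch of what that query might look like, using Python's built-in `sqlite3` with an in-memory database. The schema (`patients`, `colonoscopies`, `crc_deaths`), the choice of death as the outcome, and the five sample rows are all assumptions for illustration; a real EMR or claims data set would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, invented for this sketch.
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, birth_year INT, state TEXT, insurer TEXT);
CREATE TABLE colonoscopies (patient_id INT, year INT);
CREATE TABLE crc_deaths (patient_id INT, year INT);
""")
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?)", [
    (1, 1960, "NY", "BCBS"), (2, 1960, "NY", "BCBS"),
    (3, 1960, "NY", "BCBS"), (4, 1960, "NY", "BCBS"),
    (5, 1961, "NY", "BCBS"),  # outside the cohort: wrong birth year
])
conn.executemany("INSERT INTO colonoscopies VALUES (?, ?)",
                 [(1, 2009), (2, 2012), (3, 2015)])  # 2015 is outside the window
conn.executemany("INSERT INTO crc_deaths VALUES (?, ?)", [(3, 2018), (5, 2019)])

QUERY = """
SELECT screened, COUNT(*) AS n, SUM(died) AS deaths
FROM (
  SELECT p.id,
         EXISTS (SELECT 1 FROM colonoscopies c
                 WHERE c.patient_id = p.id
                   AND c.year BETWEEN 2008 AND 2012) AS screened,
         EXISTS (SELECT 1 FROM crc_deaths d
                 WHERE d.patient_id = p.id
                   AND d.year BETWEEN 2012 AND 2021) AS died
  FROM patients p
  WHERE p.birth_year = 1960 AND p.state = 'NY' AND p.insurer = 'BCBS'
)
GROUP BY screened
ORDER BY screened
"""
rows = conn.execute(QUERY).fetchall()
for screened, n, deaths in rows:
    print(f"screened={screened} n={n} deaths={deaths}")
```

The output is one row per group with a head count and an outcome count. That is a correlation at most, and it inherits whatever differences exist between people who did and did not get screened.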