by sokoloff 1343 days ago
It seems to me that with electronic medical records you could do population-wide studies using data that already exists (and I think that would pass IRB, or at least "ought to"). That would likely tell you "for the patients matching criteria X (covered by BCBS and living in a given state, or whatever), the 10-year outcome for patients who turned 50 in 2010 was Y vs Z, conditioned on whether they had a colonoscopy".

What would you be trying to learn from that? You're comparing unlike groups. You can't draw any conclusions.
If I look at all people born in 1960, living in NY, covered by BCBS, that’s a like group.

If I then split by “had a colonoscopy between 2008 and 2012, inclusive” vs “didn’t” and look at 2012 through 2021 outcomes, it’s possible that the split itself makes them unlike groups (by definition it does, at least on the primary selection criterion). Given that the effort is approximately that of a SQL query, I’d be interested to know whether there’s a possible signal there. It would then need to be corroborated against other data sets to see whether the correlation is repeatable, and only then whether there’s any likely causal link.
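
To make "approximately a SQL query" concrete, here is a toy sketch using Python's built-in sqlite3. Everything in it is an assumption for illustration: the `patients` table, its column names, the made-up rows, and the choice of a colorectal cancer death as the "outcome". Real EMR or claims data would look nothing like this, and the result is only a raw comparison between two groups that may differ in other ways.

```python
import sqlite3

# Hypothetical schema and toy rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patients (
        id INTEGER PRIMARY KEY,
        birth_year INTEGER,
        state TEXT,
        insurer TEXT,
        colonoscopy_year INTEGER,  -- NULL if none on record
        crc_death_year INTEGER     -- NULL if no such outcome on record
    )
""")
rows = [
    (1, 1960, "NY", "BCBS", 2010, None),
    (2, 1960, "NY", "BCBS", 2009, 2019),
    (3, 1960, "NY", "BCBS", None, 2015),
    (4, 1960, "NY", "BCBS", None, None),
    (5, 1960, "NY", "BCBS", None, 2020),
    (6, 1961, "NY", "BCBS", 2011, None),  # different birth year, filtered out
    (7, 1960, "NJ", "BCBS", None, 2014),  # different state, filtered out
]
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?, ?, ?)", rows)

# Fix the cohort (born 1960, NY, BCBS), then split on colonoscopy in
# 2008-2012 and count outcomes in 2012-2021 for each side of the split.
query = """
    SELECT
        CASE WHEN colonoscopy_year BETWEEN 2008 AND 2012 THEN 1 ELSE 0 END AS screened,
        COUNT(*) AS n,
        SUM(CASE WHEN crc_death_year BETWEEN 2012 AND 2021 THEN 1 ELSE 0 END) AS events
    FROM patients
    WHERE birth_year = 1960 AND state = 'NY' AND insurer = 'BCBS'
    GROUP BY screened
    ORDER BY screened
"""
results = conn.execute(query).fetchall()
for screened, n, events in results:
    print(f"screened={screened} n={n} events={events} rate={events / n:.2f}")
```

The query only reports counts and rates per group. It does nothing to adjust for why some people got screened and others didn't, which is exactly the "unlike groups" objection.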