by bglazer 862 days ago
One caveat is that this is necessarily a retrospective study. The authors looked at historical data from the UK Biobank, ran a regression, and found these genes. So, it’s not really clear if these genes are actually causing dementia, or if they’re just related to a common cause, or (most likely) tied together through some very complicated web of biological mechanisms.

That said, they’re interesting genes. GFAP is expressed in astrocytes, which are glial cells. These perform a lot of tasks in the brain, but they seem important in protecting neurons from toxic stresses. So, this may give us some additional insight into the role of glial cells like astrocytes in dementia.

If it's predictive, does it matter?
Yes, it does. Think of the usual notions of statistical significance when testing one new idea prospectively, say a p-value, or the AUC used in the paper. Now suppose instead that you have a rich dataset and are free to fish through possibly tens of thousands of signals for the one signal, or combination of signals, that matches your outcome. Loosely speaking, you are overfitting, and the threshold for being surprised (that is, for statistical significance) has to be much stricter.

https://en.wikipedia.org/wiki/Bonferroni_correction
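
To put rough numbers on that, here is a minimal sketch of the arithmetic behind the Bonferroni correction. The 0.05 level and the 10,000 candidate signals are illustrative values I picked, not figures from the paper, and the family-wise error formula assumes the tests are independent (the Bonferroni bound itself does not need that).

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate <= alpha."""
    return alpha / n_tests


def familywise_error_rate(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests null tests.

    Assumes the tests are independent, which real biomarkers usually are not.
    """
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


if __name__ == "__main__":
    alpha, n_signals = 0.05, 10_000  # illustrative numbers, not from the paper

    # Uncorrected: testing every signal at 0.05 makes a false hit near certain.
    print("uncorrected FWER:", familywise_error_rate(alpha, n_signals))

    # Corrected: each individual test must clear a much stricter bar.
    corrected = bonferroni_threshold(alpha, n_signals)
    print("per-test threshold:", corrected)
    print("corrected FWER:", familywise_error_rate(corrected, n_signals))
```

With these inputs the uncorrected chance of at least one spurious hit is effectively 1, while the corrected per-test threshold drops to alpha divided by the number of signals tested.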

Sure, but let's say that we test this and it is predictive on new data (not overfitting), but we have no idea at all how it works. It's still a useful test.
The retrospective regression on a specific dataset might discover a true correlated quantity, if any true correlated quantities were there and their signal was more prominent than the combinations you get from the noise. However, this analysis will always discover a quantity that correlates, by design. These retrospective studies can prompt prospective studies for a correlated quantity (a biomarker in this case) and the careful analysis of the retrospective study methodologies and results can suggest the design of such prospective studies; if a prospective study works, then that is fantastic. The retrospective studies are mostly there for statisticians to figure things out for future tests, except when the signal is simple and phenomenal.
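
A toy simulation of the "will always discover a quantity that correlates" point, as a sketch only: the outcome and every feature are pure noise, the sample and feature counts are made up, and `best_noise_feature` is a hypothetical helper, not anything from the paper. It picks the feature that best correlates with the outcome on one half of the data, then checks the same feature on the other half.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_noise_feature(n_samples=200, n_features=2000, seed=0):
    """Select the noise feature most correlated with a random outcome.

    Returns (correlation on the selection half, correlation on the held-out half).
    """
    rng = random.Random(seed)
    # Random binary "diagnosis" that no feature can actually predict.
    outcome = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n_samples)]
    half = n_samples // 2

    best_train_r, best_test_r = 0.0, 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        train_r = pearson(feature[:half], outcome[:half])
        if abs(train_r) > abs(best_train_r):
            best_train_r = train_r
            # Score the selected feature on data that played no part in selection.
            best_test_r = pearson(feature[half:], outcome[half:])
    return best_train_r, best_test_r


if __name__ == "__main__":
    train_r, test_r = best_noise_feature()
    print(f"selected on first half: r = {train_r:+.2f}")
    print(f"same feature, held out:  r = {test_r:+.2f}")
```

The winning feature typically looks respectably correlated on the half it was selected from and close to nothing on the held-out half, which is the gap a prospective study is meant to expose.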
I guess one issue is that our environment changes, so what was predictive in the past may not be predictive today.

Can gene expression be affected by pollutants that were more common decades ago, like asbestos, coal dust or leaded petrol? It would be frustrating to only discover this in 15 years' time.

But it is not a retrospective study? They took blood samples at baseline from all participants, and later on some developed dementia?
For this to be prospective rather than retrospective, they would have had to develop the risk model beforehand.