| HN Mirror



	by rwmj 862 days ago
	If it's predictive, does it matter?


pama 861 days ago

Yes, it does. Think of the usual notions of statistical significance when testing one new idea prospectively, say a p-value, or the AUC used in the paper. Now suppose instead that you have a rich dataset and are free to fish through possibly tens of thousands of signals for one signal, or a combination of signals, that matches your result. Loosely speaking, you are overfitting, and the threshold for being surprised, or for claiming statistical significance, is now much stricter.

https://en.wikipedia.org/wiki/Bonferroni_correction
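
To make the multiple-testing point concrete, here is a minimal Python sketch (an illustration only, not anything taken from the paper). It assumes all of 20,000 hypothetical signals are pure noise, so each p-value is uniform on [0, 1], and compares the usual 0.05 cutoff with the Bonferroni-corrected one. The names `bonferroni_threshold` and `n_signals`, the signal count, and the seed are made up for the example.

```python
import random

def bonferroni_threshold(alpha, n_tests):
    """Per-test cutoff that keeps the family-wise error rate at most alpha."""
    return alpha / n_tests

rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
alpha = 0.05
n_signals = 20_000       # assumed stand-in for "tens of thousands of signals"

# Assumption: no signal is real. Under the null hypothesis a p-value is
# uniformly distributed on [0, 1], so we can draw the p-values directly.
p_values = [rng.random() for _ in range(n_signals)]

# Fishing with the single-test cutoff: count noise signals that look "significant".
naive_hits = sum(p < alpha for p in p_values)

# Same data, Bonferroni-corrected cutoff.
threshold = bonferroni_threshold(alpha, n_signals)
corrected_hits = sum(p < threshold for p in p_values)

print(f"uncorrected cutoff {alpha}: {naive_hits} noise signals pass")
print(f"Bonferroni cutoff {threshold:.2e}: {corrected_hits} noise signals pass")
```

With the uncorrected cutoff, about 5% of the noise signals pass by chance alone, which is on the order of a thousand here; the corrected cutoff of 0.05 / 20000 is what "much stricter" looks like in numbers.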


rwmj 861 days ago

Sure, but let's say that we test this and it is predictive on new data (not overfitting), but we have no idea at all how it works. It's still a useful test.


pama 861 days ago

The retrospective regression on a specific dataset might discover a true correlated quantity, if any true correlated quantities were there and their signal was more prominent than the combinations you get from the noise. However, this analysis will always discover a quantity that correlates, by design. These retrospective studies can prompt prospective studies for a correlated quantity (a biomarker in this case) and the careful analysis of the retrospective study methodologies and results can suggest the design of such prospective studies; if a prospective study works, then that is fantastic. The retrospective studies are mostly there for statisticians to figure things out for future tests, except when the signal is simple and phenomenal.
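
As a rough illustration of "will always discover a quantity that correlates, by design", here is a small Python sketch under made-up assumptions: 2,000 features that are pure Gaussian noise, a retrospective cohort of 20 cases and 20 controls, and a prospective cohort of 100 of each. It picks the best-looking feature retrospectively and then scores only that feature on new subjects. The helper names (`auc`, `noise_cohort`) and all the sizes are hypothetical, not from the paper.

```python
import random

def auc(pos, neg):
    """Chance that a random case scores above a random control (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def noise_cohort(rng, n_per_class, n_features):
    """Return (cases, controls), each indexed [feature][subject], all pure noise."""
    def draw():
        return [[rng.gauss(0, 1) for _ in range(n_per_class)]
                for _ in range(n_features)]
    return draw(), draw()

rng = random.Random(0)   # arbitrary fixed seed
n_features = 2000        # assumed number of candidate signals

# Retrospective study: scan every feature and keep the most discriminating one.
cases, controls = noise_cohort(rng, 20, n_features)
aucs = [auc(cases[f], controls[f]) for f in range(n_features)]
best = max(range(n_features), key=lambda f: abs(aucs[f] - 0.5))
sign = 1 if aucs[best] >= 0.5 else -1        # orient so that higher means "case"
retro_auc = max(aucs[best], 1 - aucs[best])

# Prospective study: measure only the chosen feature on new subjects.
# It is still noise, so the new values do not depend on case/control status.
new_cases = [sign * rng.gauss(0, 1) for _ in range(100)]
new_controls = [sign * rng.gauss(0, 1) for _ in range(100)]
prospective_auc = auc(new_cases, new_controls)

print(f"best retrospective AUC among {n_features} noise features: {retro_auc:.2f}")
print(f"same feature on new subjects: {prospective_auc:.2f}")
```

The retrospective winner looks impressive only because it was selected from thousands of candidates; on new subjects it should fall back toward 0.5. A feature with a real effect would keep its AUC, which is why the prospective study is the part that counts.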


Fluorescence 861 days ago

I guess one issue is that our environment changes, so what was predictive in the past may not be predictive today.

Can gene expression be affected by pollutants that were more common decades ago, like asbestos, coal dust or leaded petrol? It would be frustrating to only discover this in 15 years' time.
