Yes it does. Think of the typical notions of statistical significance when testing one new idea prospectively, say a p-value, or the AUC used in the paper. Now think instead of a rich dataset in which you are free to fish through possibly tens of thousands of signals for the one signal, or combination of signals, that matches your result. Loosely speaking, you are overfitting, and the threshold for being surprised, or for claiming statistical significance, is now much stricter.
Sure, but let's say that we test this and it is predictive on new data (not overfitting), but we have no idea at all how it works. It's still a useful test.
A retrospective regression on a specific dataset might discover a truly correlated quantity, if any were there and its signal was more prominent than the combinations you get from noise. However, this kind of analysis will always discover some quantity that correlates, by design. These retrospective studies can prompt prospective studies for a correlated quantity (a biomarker in this case), and careful analysis of the retrospective methodology and results can suggest how to design such prospective studies. If a prospective study works, then that is fantastic. The retrospective studies are mostly there for statisticians to figure things out for future tests, except when the signal is simple and phenomenal.
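To make the "always discovers a correlate, by design" point concrete, here is a minimal sketch using only the Python standard library. Everything in it is an illustrative assumption rather than anything from the paper: the function names, the sample size of 50, the 2000 candidate features, and the use of Pearson correlation as the score. Both the features and the outcome are pure noise, yet the best of many candidates looks impressive on the data used to pick it, while on fresh data it is just one more noise draw.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_noise_feature(n_samples=50, n_features=2000, seed=0):
    """Fish for the noise feature that best correlates with a noise outcome.

    Returns (best_r, fresh_r): the correlation of the winning feature on the
    data used to select it, and the correlation of that same (noise) feature
    with the outcome on newly drawn samples.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]

    # Retrospective "fishing": keep whichever feature correlates most strongly.
    best_r = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson(feature, outcome)
        if abs(r) > abs(best_r):
            best_r = r

    # Prospective check: the winner is noise, so new samples of it are
    # independent of new samples of the outcome.
    fresh_feature = [rng.gauss(0, 1) for _ in range(n_samples)]
    fresh_outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    fresh_r = pearson(fresh_feature, fresh_outcome)
    return best_r, fresh_r


if __name__ == "__main__":
    best_r, fresh_r = best_noise_feature()
    print(f"selected on old data: r = {best_r:+.2f}")
    print(f"same feature, new data: r = {fresh_r:+.2f}")
```

The selected correlation grows as you add more candidate features or use fewer samples, which is roughly the regime of a gene expression panel measured on a modest cohort.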
I guess one issue is that our environment changes, so what was predictive in the past may not be predictive today.
Can gene expression be affected by pollutants that were more common decades ago, like asbestos, coal dust or leaded petrol? It would be frustrating to only discover this in 15 years' time.
https://en.wikipedia.org/wiki/Bonferroni_correction
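
The correction itself is simple arithmetic: with m tests and a desired family-wise error rate alpha, each individual test is held to alpha / m. A small sketch follows; the function names and the example p-values are made up for illustration, not taken from the paper.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant after Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # Hypothetical scenario: 10,000 candidate signals screened at alpha = 0.05.
    print(bonferroni_threshold(0.05, 10_000))
    # Hypothetical p-values from three tests.
    print(significant_after_bonferroni([0.001, 0.04, 0.0000001]))
```

With tens of thousands of candidate signals, a p-value that would look convincing for a single pre-registered hypothesis falls far short of the corrected threshold. Bonferroni is conservative, especially when the signals are correlated with each other, so treat it as a rough upper bound on how strict you need to be.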