by gwern
1493 days ago
> You can choose to make them less noisy by restricting to significant SNPs, for example.

That makes them more noisy, not less. PGS predictive power for EDU/IQ is always maximized at use of all SNPs. Restricting to the arbitrary subset of genome-wide statistically-significant SNPs in Lee would drive it from the 7% or so they have to <1%, IIRC.

Also, neither of your two problems are the problem here, as the biases there would not be expected to drive a correlation between video game playing & IQ (what sort of within-ethnic interaction would you need for that and why is it plausible?), and would mostly serve to simply not control for intelligence (and quantitatively, because the PGS here is a small fraction of the variance, even gross biases which somehow did manage to drive correlations between those two variables, would still be unable to meaningfully affect the estimates).
|
Using only genome-wide significant SNPs reduces the variance explained by the polygenic score, which is what you describe, and I agree with it. My comment about "noise" was a response to a sibling comment ("Polygenic scores are powerful, but they contain very large amounts of noise compared to the true genetic effects."); that is the noise I was addressing. And just as you say, that noise is essentially a cost worth paying, since it should not be directional, which is why we use various approaches to include thousands or millions of SNPs in these scores.
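
To make the variance-explained point concrete, here is a toy sketch, not an analysis of any real GWAS. It assumes independent, standardized SNPs that are all causal, a phenotype scaled to variance 1, and estimated effects equal to true effects plus sampling noise. The function name `simulate_pgs_r2` and all parameter values are hypothetical choices for illustration.

```python
import math
import random

def simulate_pgs_r2(n_snps=1000, h2=0.5, n_gwas=2000, z_threshold=5.45, seed=0):
    """Toy comparison of a PGS built from all SNPs vs. only 'significant' ones.

    Assumptions (illustrative only): independent standardized SNPs, every SNP
    causal, phenotype variance 1, GWAS standard error ~ 1/sqrt(n_gwas).
    """
    rng = random.Random(seed)
    sd_true = math.sqrt(h2 / n_snps)   # SD of true per-SNP effects
    se = 1.0 / math.sqrt(n_gwas)       # approximate SE of each estimated effect
    true = [rng.gauss(0.0, sd_true) for _ in range(n_snps)]
    est = [b + rng.gauss(0.0, se) for b in true]

    def r2(keep):
        # With independent standardized SNPs and var(y) = 1:
        # R^2 = cov(PGS, y)^2 / var(PGS)
        cov = sum(est[j] * true[j] for j in keep)
        var = sum(est[j] ** 2 for j in keep)
        return cov * cov / var if var > 0 else 0.0

    all_snps = range(n_snps)
    # |z| > 5.45 roughly corresponds to the usual p < 5e-8 threshold
    significant = [j for j in all_snps if abs(est[j] / se) > z_threshold]
    return r2(all_snps), r2(significant), len(significant)

r2_all, r2_sig, n_sig = simulate_pgs_r2()
print(f"all SNPs: R^2 = {r2_all:.3f}")
print(f"significant only ({n_sig} SNPs): R^2 = {r2_sig:.3f}")
```

In this setup very few SNPs clear the threshold, so the thresholded score explains far less variance than the all-SNP score, even though each individual estimate in the all-SNP score is noisy.
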

> Also, neither of your two problems are the problem here

I don't agree. These problems occur very clearly in any mixed-ancestry analysis, and they have to be carefully accounted for or else they induce between-ancestry bias. It's not a function of the phenotype itself (i.e., I'm not making a comment about intelligence); this is true for all polygenic scores.