Abstract
Population stratification continues to bias the results of genome-wide association
studies (GWAS). When these results are used to construct polygenic scores, even subtle biases can
cumulatively lead to large errors. To study the effect of residual stratification, we simulated GWAS
under realistic models of demographic history. We show that when population structure is recent, it
cannot be corrected using principal components of common variants because they are
uninformative about recent history. Consequently, polygenic scores are biased in that they
recapitulate environmental structure. Principal components calculated from rare variants or
identity-by-descent segments can correct this stratification for some types of environmental
effects. While family-based studies are immune to stratification, the hybrid approach of
ascertaining variants in GWAS but reestimating effect sizes in siblings reduces but does not
eliminate stratification. We show that the effect of population stratification depends not only on
allele frequencies and environmental structure but also on demographic history.
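
The basic mechanism can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the simulation framework used in the study: two discrete populations, a single non-causal variant, and a purely environmental phenotype shift. All parameter values and the helper `snp_effect` are hypothetical. The known population label stands in for a covariate that captures the structure perfectly, which is exactly what common-variant principal components are reported to fail at when structure is recent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # individuals per population (hypothetical value)

# Two populations that differ in allele frequency at a non-causal variant.
pop = np.repeat([0.0, 1.0], n)
freq = np.where(pop == 0, 0.2, 0.4)
genotype = rng.binomial(2, freq).astype(float)

# The phenotype has no genetic component at all:
# it is an environmental shift tied to population, plus noise.
phenotype = 1.0 * pop + rng.normal(0.0, 1.0, size=2 * n)


def snp_effect(g, y, covariates=None):
    """Ordinary least-squares estimate of the variant's coefficient,
    optionally adjusting for one or more covariates."""
    cols = [np.ones_like(g), g]
    if covariates is not None:
        cols.append(covariates)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


# Unadjusted estimate: picks up the environmental difference between populations.
naive = snp_effect(genotype, phenotype)

# Adjusted estimate: the population label is an idealized stand-in for a
# covariate that fully captures structure (an assumption of this sketch).
adjusted = snp_effect(genotype, phenotype, covariates=pop)

print(f"unadjusted effect estimate: {naive:.3f}")
print(f"adjusted effect estimate:   {adjusted:.3f}")
```

In this sketch the unadjusted estimate is clearly nonzero even though the variant has no causal effect, while the adjusted estimate is close to zero. Summing many such spurious estimates into a polygenic score is how subtle per-variant bias can accumulate.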