Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data

MGS (Molecular Genetics of Schizophrenia) GWAS Consortium, GECCO (The Genetics and Epidemiology of Colorectal Cancer Consortium), The GAME-ON/TRICL (Transdisciplinary Research in Cancer of the Lung) GWAS Consortium, PRACTICAL (PRostate cancer AssoCiation group To Investigate Cancer Associated aLterations) Consortium, PanScan Consortium, The GAME-ON/ELLIPSE Consortium

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

Recent heritability analyses have indicated that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases based on the polygenic risk score (PRS), a simple modelling technique that can be implemented using summary-level data from the discovery samples. We herein propose modifications to improve the performance of PRS. We introduce threshold-dependent winner’s-curse adjustments for the marginal association coefficients that are used to weight the single-nucleotide polymorphisms (SNPs) in PRS. Further, as a way to incorporate external functional/annotation knowledge that could identify subsets of SNPs highly enriched for associations, we propose variable thresholds for SNP selection. We applied our methods to GWAS summary-level data of 14 complex diseases. Across all diseases, a simple winner’s-curse correction uniformly enhanced the performance of the models, whereas incorporation of functional SNPs was beneficial only for selected diseases. Compared to the standard PRS algorithm, the proposed methods in combination led to a notable gain in efficiency (25–50% increase in the prediction R²) for 5 of 14 diseases. As an example, for GWAS of type 2 diabetes, winner’s-curse correction improved prediction R² from 2.29% based on the standard PRS to 3.10% (P = 0.0017), and incorporating functional annotation data further improved R² to 3.53% (P = 2×10⁻⁵). Our simulation studies illustrate why differential treatment of certain categories of functional SNPs, even when shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction because of non-uniform linkage disequilibrium structure.
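The two ideas in the abstract, shrinking SNP weights to counter winner's curse and using a different selection threshold for annotated SNPs, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the soft-threshold (lasso-type) shrinkage is a generic estimator assumed here for illustration, and the function names, parameters, and example numbers are all hypothetical.

```python
import math
from statistics import NormalDist


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything within lam of zero becomes 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def prs_weights(betas, ses, annotated, p_default, p_annotated, correct=True):
    """Return one PRS weight per SNP from summary-level data.

    betas, ses  : marginal effect estimates and their standard errors
    annotated   : True where a SNP belongs to a (hypothetical) functional set
    p_default   : two-sided p-value threshold for ordinary SNPs
    p_annotated : two-sided p-value threshold for annotated SNPs
    correct     : if True, apply the assumed soft-threshold shrinkage
    """
    nd = NormalDist()
    weights = []
    for beta, se, is_annot in zip(betas, ses, annotated):
        # Variable thresholding: the cutoff depends on the SNP's category.
        p_cut = p_annotated if is_annot else p_default
        z_cut = nd.inv_cdf(1.0 - p_cut / 2.0)
        z = beta / se
        if abs(z) < z_cut:
            weights.append(0.0)  # SNP not selected
        elif correct:
            # Threshold-dependent shrinkage: a stricter cutoff implies a
            # larger selection bias, hence a larger correction.
            weights.append(soft_threshold(z, z_cut) * se)
        else:
            weights.append(beta)  # standard PRS: raw marginal estimate
    return weights


def polygenic_score(genotypes, weights):
    """Weighted sum of allele counts for one individual."""
    return sum(g * w for g, w in zip(genotypes, weights))


if __name__ == "__main__":
    betas = [0.30, 0.05, -0.20]
    ses = [0.05, 0.05, 0.05]
    annotated = [False, False, True]
    w = prs_weights(betas, ses, annotated, p_default=5e-8, p_annotated=1e-3)
    print(w, polygenic_score([2, 1, 1], w))
```

In this toy input the third SNP enters the score only because its annotated category is given the more lenient threshold, and both selected SNPs receive weights smaller in magnitude than their raw estimates.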

Original language: English (US)
Article number: e1006493
Journal: PLoS Genetics
Volume: 12
Issue number: 12
DOI: https://doi.org/10.1371/journal.pgen.1006493
State: Published - Dec 1 2016

Fingerprint

Genome-Wide Association Study
genome
single nucleotide polymorphism
Single Nucleotide Polymorphism
polymorphism
modeling
prediction
heritability
diabetes
Linkage Disequilibrium
linkage disequilibrium
genome-wide association study
noninsulin-dependent diabetes mellitus
disequilibrium
Type 2 Diabetes Mellitus
genetic improvement
methodology
Weights and Measures
simulation

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
  • Genetics (clinical)
  • Cancer Research

Cite this

MGS (Molecular Genetics of Schizophrenia) GWAS Consortium, GECCO (The Genetics and Epidemiology of Colorectal Cancer Consortium), The GAME-ON/TRICL (Transdisciplinary Research in Cancer of the Lung) GWAS Consortium, PRACTICAL (PRostate cancer AssoCiation group To Investigate Cancer Associated aLterations) Consortium, PanScan Consortium, & The GAME-ON/ELLIPSE Consortium (2016). Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data. PLoS Genetics, 12(12), [e1006493]. https://doi.org/10.1371/journal.pgen.1006493

@article{29052aa1e8a947529642d63506687402,
title = "Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data",
abstract = "Recent heritability analyses have indicated that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases based on the polygenic risk score (PRS), a simple modelling technique that can be implemented using summary-level data from the discovery samples. We herein propose modifications to improve the performance of PRS. We introduce threshold-dependent winner’s-curse adjustments for the marginal association coefficients that are used to weight the single-nucleotide polymorphisms (SNPs) in PRS. Further, as a way to incorporate external functional/annotation knowledge that could identify subsets of SNPs highly enriched for associations, we propose variable thresholds for SNP selection. We applied our methods to GWAS summary-level data of 14 complex diseases. Across all diseases, a simple winner’s-curse correction uniformly enhanced the performance of the models, whereas incorporation of functional SNPs was beneficial only for selected diseases. Compared to the standard PRS algorithm, the proposed methods in combination led to a notable gain in efficiency (25–50{\%} increase in the prediction R$^2$) for 5 of 14 diseases. As an example, for GWAS of type 2 diabetes, winner’s-curse correction improved prediction R$^2$ from 2.29{\%} based on the standard PRS to 3.10{\%} (P = 0.0017), and incorporating functional annotation data further improved R$^2$ to 3.53{\%} (P = $2\times10^{-5}$). Our simulation studies illustrate why differential treatment of certain categories of functional SNPs, even when shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction because of non-uniform linkage disequilibrium structure.",
author = "{MGS (Molecular Genetics of Schizophrenia) GWAS Consortium} and {GECCO (The Genetics and Epidemiology of Colorectal Cancer Consortium)} and {The GAME-ON/TRICL (Transdisciplinary Research in Cancer of the Lung) GWAS Consortium} and {PRACTICAL (PRostate cancer AssoCiation group To Investigate Cancer Associated aLterations) Consortium} and {PanScan Consortium} and {The GAME-ON/ELLIPSE Consortium} and Jianxin Shi and Park, {Ju Hyun} and Jubao Duan and Berndt, {Sonja T.} and Winton Moy and Kai Yu and Lei Song and William Wheeler and Xing Hua and Debra Silverman and Montserrat Garcia-Closas and Hsiung, {Chao Agnes} and Figueroa, {Jonine D.} and Cortessis, {Victoria K.} and N{\'u}ria Malats and Karagas, {Margaret R.} and Paolo Vineis and Chang, {I. Shou} and Dongxin Lin and Baosen Zhou and Adeline Seow and Keitaro Matsuo and Hong, {Yun Chul} and Caporaso, {Neil E.} and Brian Wolpin and Eric Jacobs and Petersen, {Gloria M} and Klein, {Alison P.} and Donghui Li and Harvey Risch and Sanders, {Alan R.} and Li Hsu and Schoen, {Robert E.} and Hermann Brenner and Rachael Stolzenberg-Solomon and Pablo Gejman and Qing Lan and Nathaniel Rothman and Amundadottir, {Laufey T.} and Landi, {Maria Teresa} and Levinson, {Douglas F.} and Chanock, {Stephen J.} and Nilanjan Chatterjee",
year = "2016",
month = "12",
day = "1",
doi = "10.1371/journal.pgen.1006493",
language = "English (US)",
volume = "12",
journal = "PLoS Genetics",
issn = "1553-7390",
publisher = "Public Library of Science",
number = "12",
pages = "e1006493",
}

TY - JOUR

T1 - Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data

AU - MGS (Molecular Genetics of Schizophrenia) GWAS Consortium

AU - GECCO (The Genetics and Epidemiology of Colorectal Cancer Consortium)

AU - The GAME-ON/TRICL (Transdisciplinary Research in Cancer of the Lung) GWAS Consortium

AU - PRACTICAL (PRostate cancer AssoCiation group To Investigate Cancer Associated aLterations) Consortium

AU - PanScan Consortium

AU - The GAME-ON/ELLIPSE Consortium

AU - Shi, Jianxin

AU - Park, Ju Hyun

AU - Duan, Jubao

AU - Berndt, Sonja T.

AU - Moy, Winton

AU - Yu, Kai

AU - Song, Lei

AU - Wheeler, William

AU - Hua, Xing

AU - Silverman, Debra

AU - Garcia-Closas, Montserrat

AU - Hsiung, Chao Agnes

AU - Figueroa, Jonine D.

AU - Cortessis, Victoria K.

AU - Malats, Núria

AU - Karagas, Margaret R.

AU - Vineis, Paolo

AU - Chang, I. Shou

AU - Lin, Dongxin

AU - Zhou, Baosen

AU - Seow, Adeline

AU - Matsuo, Keitaro

AU - Hong, Yun Chul

AU - Caporaso, Neil E.

AU - Wolpin, Brian

AU - Jacobs, Eric

AU - Petersen, Gloria M

AU - Klein, Alison P.

AU - Li, Donghui

AU - Risch, Harvey

AU - Sanders, Alan R.

AU - Hsu, Li

AU - Schoen, Robert E.

AU - Brenner, Hermann

AU - Stolzenberg-Solomon, Rachael

AU - Gejman, Pablo

AU - Lan, Qing

AU - Rothman, Nathaniel

AU - Amundadottir, Laufey T.

AU - Landi, Maria Teresa

AU - Levinson, Douglas F.

AU - Chanock, Stephen J.

AU - Chatterjee, Nilanjan

PY - 2016/12/1

Y1 - 2016/12/1

AB - Recent heritability analyses have indicated that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases based on the polygenic risk score (PRS), a simple modelling technique that can be implemented using summary-level data from the discovery samples. We herein propose modifications to improve the performance of PRS. We introduce threshold-dependent winner’s-curse adjustments for the marginal association coefficients that are used to weight the single-nucleotide polymorphisms (SNPs) in PRS. Further, as a way to incorporate external functional/annotation knowledge that could identify subsets of SNPs highly enriched for associations, we propose variable thresholds for SNP selection. We applied our methods to GWAS summary-level data of 14 complex diseases. Across all diseases, a simple winner’s-curse correction uniformly enhanced the performance of the models, whereas incorporation of functional SNPs was beneficial only for selected diseases. Compared to the standard PRS algorithm, the proposed methods in combination led to a notable gain in efficiency (25–50% increase in the prediction R²) for 5 of 14 diseases. As an example, for GWAS of type 2 diabetes, winner’s-curse correction improved prediction R² from 2.29% based on the standard PRS to 3.10% (P = 0.0017), and incorporating functional annotation data further improved R² to 3.53% (P = 2×10⁻⁵). Our simulation studies illustrate why differential treatment of certain categories of functional SNPs, even when shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction because of non-uniform linkage disequilibrium structure.

UR - http://www.scopus.com/inward/record.url?scp=85007574079&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85007574079&partnerID=8YFLogxK

U2 - 10.1371/journal.pgen.1006493

DO - 10.1371/journal.pgen.1006493

M3 - Article

C2 - 28036406

AN - SCOPUS:85007574079

VL - 12

JO - PLoS Genetics

JF - PLoS Genetics

SN - 1553-7390

IS - 12

M1 - e1006493

ER -