A principal component approach to improve association testing with polygenic risk scores

Brandon J. Coombes; Alexander Ploner; Sarah E. Bergen; Joanna M. Biernacka

doi:10.1002/gepi.22339

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain; thus multiple settings are used and the best is chosen. Optimization can lead to inflated Type I error. Permutation procedures can correct this, but they can be computationally intensive. Alternatively, a single parameter setting can be chosen a priori for the PRS, but choosing suboptimal settings results in loss of power. We propose computing PRSs under a range of parameter settings, performing principal component analysis (PCA) on the resulting set of PRSs, and using the first PRS–PC in association tests. The first PC reweights the variants included in the PRS to achieve maximum variation over all PRS settings used. Using simulations and a real data application to study PRS association with bipolar disorder and psychosis in bipolar disorder, we compare the performance of the proposed PRS–PCA approach with a permutation test and an a priori selected p-value threshold. The PRS–PCA approach is simple to implement, outperforms the other strategies in most scenarios, and provides an unbiased estimate of prediction performance.
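The PRS–PCA idea described in the abstract can be sketched in a few lines. The following is a minimal illustration on simulated data, not the authors' implementation: the function name `first_prs_pc`, the toy genotypes, effect sizes, p-values, and threshold grid are all hypothetical, pruning is skipped, and standardizing each PRS before the PCA is an assumption made here so that no single setting dominates by scale.

```python
import numpy as np


def first_prs_pc(prs):
    """Return (scores, loadings) of the first principal component.

    prs: array of shape (n_individuals, n_settings); each column is a PRS
    computed under one parameter setting (here, one p-value threshold).
    """
    prs = np.asarray(prs, dtype=float)
    # Assumption: standardize each PRS column before the PCA.
    z = (prs - prs.mean(axis=0)) / prs.std(axis=0, ddof=1)
    # SVD of the standardized matrix; rows of vt are the PC loadings.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a PC is arbitrary; orient it so the loadings sum to >= 0.
    if loadings.sum() < 0:
        loadings = -loadings
    return z @ loadings, loadings


# Hypothetical toy data: genotypes, GWAS effect sizes, and GWAS p-values.
rng = np.random.default_rng(0)
n, m = 500, 1000
genotypes = rng.binomial(2, 0.3, size=(n, m)).astype(float)
betas = rng.normal(0.0, 0.1, size=m)
pvals = rng.uniform(size=m)

# One PRS per p-value threshold (thresholding only, no pruning step).
thresholds = [0.01, 0.05, 0.1, 0.5, 1.0]
prs = np.column_stack(
    [genotypes[:, pvals <= t] @ betas[pvals <= t] for t in thresholds]
)

scores, loadings = first_prs_pc(prs)

# Simulated quantitative phenotype, used only to show where the first
# PRS-PC would enter an association test in place of a single PRS.
z_last = (prs[:, -1] - prs[:, -1].mean()) / prs[:, -1].std(ddof=1)
phenotype = 0.3 * z_last + rng.normal(size=n)
r = np.corrcoef(scores, phenotype)[0, 1]
```

In a real analysis, `scores` would replace the single PRS as the predictor in whatever regression model is used for the association test, alongside the usual covariates.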

Original language	English (US)
Pages (from-to)	676-686
Number of pages	11
Journal	Genetic Epidemiology
Volume	44
Issue number	7
DOIs	https://doi.org/10.1002/gepi.22339
State	Published - Oct 1 2020

Keywords

permutation
polygenic risk scores
principal component analysis
weighting

ASJC Scopus subject areas

Epidemiology
Genetics (clinical)

Access to Document

10.1002/gepi.22339

Cite this

@article{e7af6079cbfa4d64b2aacd3e288e37db,

title = "A principal component approach to improve association testing with polygenic risk scores",

abstract = "Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain; thus multiple settings are used and the best is chosen. Optimization can lead to inflated Type I error. Permutation procedures can correct this, but they can be computationally intensive. Alternatively, a single parameter setting can be chosen a priori for the PRS, but choosing suboptimal settings results in loss of power. We propose computing PRSs under a range of parameter settings, performing principal component analysis (PCA) on the resulting set of PRSs, and using the first PRS–PC in association tests. The first PC reweights the variants included in the PRS to achieve maximum variation over all PRS settings used. Using simulations and a real data application to study PRS association with bipolar disorder and psychosis in bipolar disorder, we compare the performance of the proposed PRS–PCA approach with a permutation test and an a priori selected p-value threshold. The PRS–PCA approach is simple to implement, outperforms the other strategies in most scenarios, and provides an unbiased estimate of prediction performance.",

keywords = "permutation, polygenic risk scores, principal component analysis, weighting",

author = "Coombes, {Brandon J.} and Ploner, Alexander and Bergen, {Sarah E.} and Biernacka, {Joanna M.}",

note = "Publisher Copyright: {\textcopyright} 2020 Wiley Periodicals LLC",

year = "2020",

month = oct,

day = "1",

doi = "10.1002/gepi.22339",

language = "English (US)",

volume = "44",

pages = "676--686",

journal = "Genetic Epidemiology",

issn = "0741-0395",

publisher = "Wiley-Liss Inc.",

number = "7",

}

TY - JOUR

T1 - A principal component approach to improve association testing with polygenic risk scores

AU - Coombes, Brandon J.

AU - Ploner, Alexander

AU - Bergen, Sarah E.

AU - Biernacka, Joanna M.

PY - 2020/10/1

Y1 - 2020/10/1

N2 - Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain; thus multiple settings are used and the best is chosen. Optimization can lead to inflated Type I error. Permutation procedures can correct this, but they can be computationally intensive. Alternatively, a single parameter setting can be chosen a priori for the PRS, but choosing suboptimal settings results in loss of power. We propose computing PRSs under a range of parameter settings, performing principal component analysis (PCA) on the resulting set of PRSs, and using the first PRS–PC in association tests. The first PC reweights the variants included in the PRS to achieve maximum variation over all PRS settings used. Using simulations and a real data application to study PRS association with bipolar disorder and psychosis in bipolar disorder, we compare the performance of the proposed PRS–PCA approach with a permutation test and an a priori selected p-value threshold. The PRS–PCA approach is simple to implement, outperforms the other strategies in most scenarios, and provides an unbiased estimate of prediction performance.

KW - permutation

KW - polygenic risk scores

KW - principal component analysis

KW - weighting

UR - http://www.scopus.com/inward/record.url?scp=85088257816&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85088257816&partnerID=8YFLogxK

U2 - 10.1002/gepi.22339

DO - 10.1002/gepi.22339

M3 - Article

C2 - 32691445

AN - SCOPUS:85088257816

SN - 0741-0395

VL - 44

SP - 676

EP - 686

JO - Genetic Epidemiology

JF - Genetic Epidemiology

IS - 7

ER -
