Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes

Wonil Chung, Jun Chen, Constance Turman, Sara Lindstrom, Zhaozhong Zhu, Po Ru Loh, Peter Kraft, Liming Liang

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic effects across multiple traits for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait based on individual-level genotypes and/or summary statistics. Our novel implementation of a parallel computing algorithm makes it feasible to apply our method to biobank-scale GWAS data. We illustrate our method using large-scale GWAS data (~1M SNPs) from the UK Biobank (N = 456,837). We show that our multi-trait method outperforms the recently proposed multi-trait analysis of GWAS (MTAG) in predictive performance. With UK Biobank data, the prediction accuracy for height, with the aid of BMI, improves from R² = 35.8% (MTAG) to 42.5% (MCP + CTPR) or 42.8% (Lasso + CTPR).
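The abstract names the ingredients of the method (a Lasso or MCP sparsity penalty combined with a cross-trait penalty) but does not state the formulas. The sketch below is only an illustration under stated assumptions: the Lasso and MCP penalties follow their standard textbook definitions, while the quadratic cross-trait term, the function names, and the default `gamma=3.0` are hypothetical choices made here for illustration and are not taken from the paper or its software.

```python
import numpy as np


def lasso_penalty(beta, lam):
    """Standard Lasso (L1) penalty: lam * sum(|beta|)."""
    return lam * np.sum(np.abs(beta))


def mcp_penalty(beta, lam, gamma=3.0):
    """Standard minimax concave penalty (MCP).

    For |b| <= gamma*lam: lam*|b| - b^2 / (2*gamma); otherwise gamma*lam^2 / 2.
    gamma=3.0 is an arbitrary illustrative default.
    """
    a = np.abs(beta)
    inner = lam * a - a ** 2 / (2.0 * gamma)
    outer = 0.5 * gamma * lam ** 2
    return np.sum(np.where(a <= gamma * lam, inner, outer))


def cross_trait_penalty(B, lam2):
    """ASSUMED cross-trait term: (lam2/2) * sum over SNPs and trait pairs
    of squared differences in effect size. B has shape (n_snps, n_traits)."""
    n_traits = B.shape[1]
    total = 0.0
    for k in range(n_traits):
        for l in range(k + 1, n_traits):
            total += np.sum((B[:, k] - B[:, l]) ** 2)
    return 0.5 * lam2 * total


def objective(X, Y, B, lam1, lam2, sparsity=mcp_penalty):
    """Least-squares loss plus sparsity and (assumed) cross-trait penalties.

    X: (n_samples, n_snps) genotypes, Y: (n_samples, n_traits) phenotypes.
    """
    n = X.shape[0]
    loss = np.sum((Y - X @ B) ** 2) / (2.0 * n)
    return loss + sparsity(B, lam1) + cross_trait_penalty(B, lam2)


def r_squared(y, y_hat):
    """Prediction R^2 as the squared Pearson correlation."""
    return np.corrcoef(y, y_hat)[0, 1] ** 2
```

Under these assumptions, a larger `lam2` pulls the per-trait effect sizes of each SNP toward each other, which is one way a secondary trait such as BMI could inform prediction of a primary trait such as height; setting `lam2 = 0` reduces the objective to independent per-trait penalized regressions.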

Original language: English (US)
Article number: 569
Journal: Nature Communications
Volume: 10
Issue number: 1
DOI: 10.1038/s41467-019-08535-0
State: Published - Dec 1 2019

Fingerprint

phenotype
Genome-Wide Association Study
regression analysis
penalties
Parallel processing systems
predictions
penalty function
Statistics
Multifactorial Inheritance
Single Nucleotide Polymorphism
Genotype
alachlor

ASJC Scopus subject areas

  • Chemistry (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Physics and Astronomy (all)

Cite this

Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes. / Chung, Wonil; Chen, Jun; Turman, Constance; Lindstrom, Sara; Zhu, Zhaozhong; Loh, Po Ru; Kraft, Peter; Liang, Liming.

In: Nature Communications, Vol. 10, No. 1, 569, 01.12.2019.

Chung, Wonil ; Chen, Jun ; Turman, Constance ; Lindstrom, Sara ; Zhu, Zhaozhong ; Loh, Po Ru ; Kraft, Peter ; Liang, Liming. / Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes. In: Nature Communications. 2019 ; Vol. 10, No. 1.
@article{cf0f1e9095904cde9111faa14a699135,
title = "Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes",
author = "Wonil Chung and Jun Chen and Constance Turman and Sara Lindstrom and Zhaozhong Zhu and Loh, {Po Ru} and Peter Kraft and Liming Liang",
year = "2019",
month = "12",
day = "1",
doi = "10.1038/s41467-019-08535-0",
language = "English (US)",
volume = "10",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes

AU - Chung, Wonil

AU - Chen, Jun

AU - Turman, Constance

AU - Lindstrom, Sara

AU - Zhu, Zhaozhong

AU - Loh, Po Ru

AU - Kraft, Peter

AU - Liang, Liming

PY - 2019/12/1

Y1 - 2019/12/1

UR - http://www.scopus.com/inward/record.url?scp=85061057577&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061057577&partnerID=8YFLogxK

U2 - 10.1038/s41467-019-08535-0

DO - 10.1038/s41467-019-08535-0

M3 - Article

C2 - 30718517

AN - SCOPUS:85061057577

VL - 10

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

IS - 1

M1 - 569

ER -