Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data

Research output: Contribution to journal (article)

7 Citations (Scopus)

Abstract

Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research.
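The gene-based LASSO idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple burden count (the sum of minor-allele counts per gene) as the aggregation measure and fits an L1-penalized logistic regression by proximal gradient descent. All function names, gene labels, and toy data are hypothetical, and the exon-based sparse group LASSO is not covered here.

```python
import math


def burden_scores(genotypes, gene_of_variant, genes):
    """Collapse per-variant minor-allele counts into one burden count per gene."""
    scores = []
    for row in genotypes:
        totals = {g: 0.0 for g in genes}
        for count, gene in zip(row, gene_of_variant):
            totals[gene] += count
        scores.append([totals[g] for g in genes])
    return scores


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, lam, step=0.1, iterations=2000):
    """Fit L1-penalized logistic regression; the intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(iterations):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            eta = intercept + sum(b * x for b, x in zip(beta, xi))
            resid = sigmoid(eta) - yi
            grad0 += resid / n
            for j in range(p):
                grad[j] += resid * xi[j] / n
        intercept -= step * grad0
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return intercept, beta


# Hypothetical toy data: four rare variants in two genes, 20 subjects.
genes = ["GENE_A", "GENE_B"]
gene_of_variant = ["GENE_A", "GENE_A", "GENE_B", "GENE_B"]
genotypes = [
    [1, 0, 0, 0],  # case carrying a GENE_A variant
    [0, 1, 0, 1],  # case carrying one variant in each gene
    [0, 0, 0, 0],  # control with no rare variants
    [0, 0, 1, 0],  # control carrying a GENE_B variant
] * 5
status = [1, 1, 0, 0] * 5  # 1 = case, 0 = control

burden = burden_scores(genotypes, gene_of_variant, genes)
intercept, beta = lasso_logistic(burden, status, lam=0.05)
print(dict(zip(genes, beta)))
```

Genes whose coefficient is shrunk exactly to zero are treated as unselected. Increasing `lam` selects fewer genes, and a sufficiently large value leaves none.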

Original language: English (US)
Pages (from-to): 104-113
Number of pages: 10
Journal: Genetic Epidemiology
Volume: 38
Issue number: 2
DOI: 10.1002/gepi.21783
State: Published - Feb 2014

Fingerprint

Exome
Genes
Exons
Genome-Wide Association Study
Sample Size
Technology
Sensitivity and Specificity

Keywords

  • Association analysis
  • Exome sequencing
  • LASSO
  • Rare variants
  • Regularization

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data. / Larson, Nicholas; Schaid, Daniel J.

In: Genetic Epidemiology, Vol. 38, No. 2, 02.2014, p. 104-113.

@article{4dda80272d7a4807be265e95d8c15727,
title = "Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data",
abstract = "Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research.",
keywords = "Association analysis, Exome sequencing, LASSO, Rare variants, Regularization",
author = "Larson, Nicholas and Schaid, {Daniel J.}",
year = "2014",
month = "2",
doi = "10.1002/gepi.21783",
language = "English (US)",
volume = "38",
pages = "104--113",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "2"
}

TY - JOUR

T1 - Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data

AU - Larson, Nicholas

AU - Schaid, Daniel J

PY - 2014/2

Y1 - 2014/2

N2 - Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research.

AB - Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research.

KW - Association analysis

KW - Exome sequencing

KW - LASSO

KW - Rare variants

KW - Regularization

UR - http://www.scopus.com/inward/record.url?scp=84892519541&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892519541&partnerID=8YFLogxK

U2 - 10.1002/gepi.21783

DO - 10.1002/gepi.21783

M3 - Article

C2 - 24382715

AN - SCOPUS:84892519541

VL - 38

SP - 104

EP - 113

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 2

ER -