Multiple Genetic Variant Association Testing by Collapsing and Kernel Methods With Pedigree or Population Structured Data

Daniel J. Schaid, Shannon K. McDonnell, Jason P. Sinnwell, Stephen N. Thibodeau

Research output: Contribution to journal › Article

56 Citations (Scopus)

Abstract

Searching for rare genetic variants associated with complex diseases can be facilitated by enriching for diseased carriers of rare variants by sampling cases from pedigrees enriched for disease, possibly with related or unrelated controls. This strategy, however, complicates analyses because of shared genetic ancestry, as well as linkage disequilibrium among genetic markers. To overcome these problems, we developed broad classes of "burden" statistics and kernel statistics, extending commonly used methods for unrelated case-control data to allow for known pedigree relationships, for autosomes and the X chromosome. Furthermore, by replacing pedigree-based genetic correlation matrices with estimates of genetic relationships based on large-scale genomic data, our methods can be used to account for population-structured data. By simulations, we show that the type I error rates of our developed methods are near the asymptotic nominal levels, allowing rapid computation of P-values. Our simulations also show that a linear weighted kernel statistic is generally more powerful than a weighted "burden" statistic. Because the proposed statistics are rapid to compute, they can be readily used for large-scale screening of the association of genomic sequence data with disease status.
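
As an illustration only, the contrast between a weighted "burden" statistic and a linear weighted kernel statistic can be sketched in their generic score-based forms for unrelated subjects. This sketch does not include the pedigree or genomic-relationship corrections that the abstract describes. The function names, the intercept-only null model, and the toy data are all assumptions made for the example.

```python
import numpy as np

def variant_scores(G, y):
    """Per-variant scores under an intercept-only null model (an assumption)."""
    r = y - y.mean()          # residuals of disease status under the null
    return G.T @ r            # one score per variant (column of G)

def burden_statistic(G, y, w):
    """Generic weighted burden form: square of the weighted sum of scores."""
    s = variant_scores(G, y)
    return float(np.dot(w, s) ** 2)

def linear_kernel_statistic(G, y, w):
    """Generic linear weighted kernel form: sum of squared weighted scores."""
    s = variant_scores(G, y)
    return float(np.sum((w * s) ** 2))

# Hypothetical toy data: 4 unrelated subjects, 2 rare variants (allele counts).
G = np.array([[1, 0],
              [0, 1],
              [0, 0],
              [0, 0]], dtype=float)
y = np.array([1.0, 0.0, 0.0, 0.0])   # 1 = case, 0 = control
w = np.ones(2)                       # equal variant weights

q_burden = burden_statistic(G, y, w)
q_kernel = linear_kernel_statistic(G, y, w)
print(q_burden, q_kernel)
```

In this toy data the two variant scores have opposite signs, so they partly cancel in the burden form but not in the kernel form. Neither value is a P-value; obtaining one would require the null distribution of each statistic, which this sketch does not cover.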

Original language: English (US)
Pages (from-to): 409-418
Number of pages: 10
Journal: Genetic Epidemiology
Volume: 37
Issue number: 5
DOI: https://doi.org/10.1002/gepi.21727
State: Published - Jul 2013

Fingerprint

Pedigree
Population
Linkage Disequilibrium
X Chromosome
Rare Diseases
Genetic Markers

Keywords

  • Burden test
  • Genome sequence data
  • Kernel statistic
  • Pedigree data
  • Rare variants

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

Multiple Genetic Variant Association Testing by Collapsing and Kernel Methods With Pedigree or Population Structured Data. / Schaid, Daniel J.; McDonnell, Shannon K.; Sinnwell, Jason P.; Thibodeau, Stephen N.

In: Genetic Epidemiology, Vol. 37, No. 5, 07.2013, p. 409-418.

@article{a9a7b8fa59464b04a4dc654182e9dc4e,
title = "Multiple Genetic Variant Association Testing by Collapsing and Kernel Methods With Pedigree or Population Structured Data",
abstract = "Searching for rare genetic variants associated with complex diseases can be facilitated by enriching for diseased carriers of rare variants by sampling cases from pedigrees enriched for disease, possibly with related or unrelated controls. This strategy, however, complicates analyses because of shared genetic ancestry, as well as linkage disequilibrium among genetic markers. To overcome these problems, we developed broad classes of {"}burden{"} statistics and kernel statistics, extending commonly used methods for unrelated case-control data to allow for known pedigree relationships, for autosomes and the X chromosome. Furthermore, by replacing pedigree-based genetic correlation matrices with estimates of genetic relationships based on large-scale genomic data, our methods can be used to account for population-structured data. By simulations, we show that the type I error rates of our developed methods are near the asymptotic nominal levels, allowing rapid computation of P-values. Our simulations also show that a linear weighted kernel statistic is generally more powerful than a weighted {"}burden{"} statistic. Because the proposed statistics are rapid to compute, they can be readily used for large-scale screening of the association of genomic sequence data with disease status.",
keywords = "Burden test, Genome sequence data, Kernel statistic, Pedigree data, Rare variants",
author = "Schaid, {Daniel J.} and McDonnell, {Shannon K.} and Sinnwell, {Jason P.} and Thibodeau, {Stephen N.}",
year = "2013",
month = "7",
doi = "10.1002/gepi.21727",
language = "English (US)",
volume = "37",
pages = "409--418",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "5",

}

TY - JOUR

T1 - Multiple Genetic Variant Association Testing by Collapsing and Kernel Methods With Pedigree or Population Structured Data

AU - Schaid, Daniel J.

AU - McDonnell, Shannon K.

AU - Sinnwell, Jason P.

AU - Thibodeau, Stephen N.

PY - 2013/7

Y1 - 2013/7

AB - Searching for rare genetic variants associated with complex diseases can be facilitated by enriching for diseased carriers of rare variants by sampling cases from pedigrees enriched for disease, possibly with related or unrelated controls. This strategy, however, complicates analyses because of shared genetic ancestry, as well as linkage disequilibrium among genetic markers. To overcome these problems, we developed broad classes of "burden" statistics and kernel statistics, extending commonly used methods for unrelated case-control data to allow for known pedigree relationships, for autosomes and the X chromosome. Furthermore, by replacing pedigree-based genetic correlation matrices with estimates of genetic relationships based on large-scale genomic data, our methods can be used to account for population-structured data. By simulations, we show that the type I error rates of our developed methods are near the asymptotic nominal levels, allowing rapid computation of P-values. Our simulations also show that a linear weighted kernel statistic is generally more powerful than a weighted "burden" statistic. Because the proposed statistics are rapid to compute, they can be readily used for large-scale screening of the association of genomic sequence data with disease status.

KW - Burden test

KW - Genome sequence data

KW - Kernel statistic

KW - Pedigree data

KW - Rare variants

UR - http://www.scopus.com/inward/record.url?scp=84879128720&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879128720&partnerID=8YFLogxK

U2 - 10.1002/gepi.21727

DO - 10.1002/gepi.21727

M3 - Article

C2 - 23650101

AN - SCOPUS:84879128720

VL - 37

SP - 409

EP - 418

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 5

ER -