Nonparametric tests of association of multiple genes with human disease

Daniel J. Schaid, Shannon K. McDonnell, Scott J. Hebbring, Julie M. Cunningham, Stephen N. Thibodeau

Research output: Contribution to journal (Article)

113 Citations (Scopus)

Abstract

The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but they too will not be very powerful when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a "kernel" function, which we allow to be fairly general. However, we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits, and offers guidelines for interpretation.
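
The pairwise scoring idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The kernel `allele_match_kernel`, the genotype coding (minor-allele counts 0, 1, 2), and every function name are hypothetical choices made for illustration, and a permutation test is swapped in for whatever variance calculation the paper uses, which the abstract does not describe.

```python
import itertools
import random


def allele_match_kernel(g1, g2):
    """Hypothetical kernel: similarity of two genotypes coded as allele counts 0, 1, 2."""
    return 1.0 - abs(g1 - g2) / 2.0


def pair_score(subj_a, subj_b, kernel=allele_match_kernel):
    """Sum the kernel over all markers for one pair of subjects."""
    return sum(kernel(a, b) for a, b in zip(subj_a, subj_b))


def mean_pair_score(group, kernel=allele_match_kernel):
    """Average pair score over all unordered pairs in a group (needs >= 2 subjects)."""
    pairs = list(itertools.combinations(group, 2))
    return sum(pair_score(a, b, kernel) for a, b in pairs) / len(pairs)


def global_contrast(cases, controls, kernel=allele_match_kernel):
    """Single-number contrast: average pair score in cases minus that in controls."""
    return mean_pair_score(cases, kernel) - mean_pair_score(controls, kernel)


def permutation_p_value(cases, controls, n_perm=1000, seed=0,
                        kernel=allele_match_kernel):
    """Assumed significance procedure: shuffle case/control labels and recompute."""
    rng = random.Random(seed)
    observed = abs(global_contrast(cases, controls, kernel))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(global_contrast(pooled[:n_cases], pooled[n_cases:], kernel))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: each subject is a tuple of genotypes at two markers (made up).
toy_cases = [(2, 2), (2, 2), (2, 1)]
toy_controls = [(0, 1), (2, 0), (1, 2)]
toy_p = permutation_p_value(toy_cases, toy_controls, n_perm=200)
```

Because the contrast collapses all markers into one number, the test has a single degree of freedom regardless of how many markers are scored, which is the property the abstract emphasizes.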

Original language: English (US)
Pages (from-to): 780-793
Number of pages: 14
Journal: American Journal of Human Genetics
Volume: 76
Issue number: 5
DOIs: 10.1086/429838
State: Published - May 2005

Fingerprint

Genes
Guidelines
Nonparametric Statistics
Genetic Markers
Case-Control Studies
Prostatic Neoplasms
Alleles
Genotype
Direction compound

ASJC Scopus subject areas

  • Genetics

Cite this

Nonparametric tests of association of multiple genes with human disease. / Schaid, Daniel J; McDonnell, Shannon K.; Hebbring, Scott J.; Cunningham, Julie M; Thibodeau, Stephen N.

In: American Journal of Human Genetics, Vol. 76, No. 5, 05.2005, p. 780-793.

@article{ea1bef6e00e64628a148c7ff2cd2181d,
title = "Nonparametric tests of association of multiple genes with human disease",
abstract = "The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but they too will not be very powerful when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a {"}kernel{"} function, which we allow to be fairly general. However, we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits, and offers guidelines for interpretation.",
author = "Schaid, {Daniel J} and McDonnell, {Shannon K.} and Hebbring, {Scott J.} and Cunningham, {Julie M} and Thibodeau, {Stephen N}",
year = "2005",
month = "5",
doi = "10.1086/429838",
language = "English (US)",
volume = "76",
pages = "780--793",
journal = "American Journal of Human Genetics",
issn = "0002-9297",
publisher = "Cell Press",
number = "5",

}

TY - JOUR

T1 - Nonparametric tests of association of multiple genes with human disease

AU - Schaid, Daniel J

AU - McDonnell, Shannon K.

AU - Hebbring, Scott J.

AU - Cunningham, Julie M

AU - Thibodeau, Stephen N

PY - 2005/5

Y1 - 2005/5

N2 - The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but they too will not be very powerful when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a "kernel" function, which we allow to be fairly general. However, we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits, and offers guidelines for interpretation.

AB - The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but they too will not be very powerful when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a "kernel" function, which we allow to be fairly general. However, we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits, and offers guidelines for interpretation.

UR - http://www.scopus.com/inward/record.url?scp=17644378739&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=17644378739&partnerID=8YFLogxK

U2 - 10.1086/429838

DO - 10.1086/429838

M3 - Article

C2 - 15786018

AN - SCOPUS:17644378739

VL - 76

SP - 780

EP - 793

JO - American Journal of Human Genetics

JF - American Journal of Human Genetics

SN - 0002-9297

IS - 5

ER -