Power and sample size for testing associations of haplotypes with complex traits

Daniel J. Schaid

doi:10.1111/j.1529-8817.2005.00215.x

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

23 Scopus citations

Abstract

Evaluation of the association of haplotypes with either quantitative traits or disease status is common practice, and under some situations provides greater power than the evaluation of individual marker loci. The focus on haplotype analyses will increase as more single nucleotide polymorphisms (SNPs) are discovered, either because of interest in candidate gene regions, or because of interest in genome-wide association studies. However, there is little guidance on the determination of the sample size needed to achieve the desired power for a study, particularly when linkage phase of the haplotypes is unknown, and when a subset of tag-SNP markers is measured. There is a growing wealth of information on the distribution of haplotypes in different populations, and it is not unusual for investigators to measure genetic markers in pilot studies in order to gain knowledge of the distribution of haplotypes in the target population. Starting with this basic information on the distribution of haplotypes, we derive analytic methods to determine sample size or power to test the association of haplotypes with either a quantitative trait or disease status (e.g., a case-control study design), assuming that all subjects are unrelated. Our derivations cover both phase-known and phase-unknown haplotypes, allowing evaluation of the loss of efficiency due to unknown phase. We also extend our methods to when a subset of tag-SNPs is chosen, allowing investigators to explore the impact of tag-SNPs on power. Simulations illustrate that the theoretical power predictions are quite accurate over a broad range of conditions. Our theoretical formulae should provide useful guidance when planning haplotype association studies.

Original language	English (US)
Pages (from-to)	116-130
Number of pages	15
Journal	Annals of Human Genetics
Volume	70
Issue number	1
DOIs	https://doi.org/10.1111/j.1529-8817.2005.00215.x
State	Published - Jan 2006

Keywords

Case-control
Diplotype
Genotype
Linkage phase
Non-centrality parameter
Quantitative trait
Single nucleotide polymorphism
Tag-SNP

ASJC Scopus subject areas

Genetics
Genetics (clinical)

Cite this

@article{829910be252444e9b04a83b7c3595797,
title = "Power and sample size for testing associations of haplotypes with complex traits",
abstract = "Evaluation of the association of haplotypes with either quantitative traits or disease status is common practice, and under some situations provides greater power than the evaluation of individual marker loci. The focus on haplotype analyses will increase as more single nucleotide polymorphisms (SNPs) are discovered, either because of interest in candidate gene regions, or because of interest in genome-wide association studies. However, there is little guidance on the determination of the sample size needed to achieve the desired power for a study, particularly when linkage phase of the haplotypes is unknown, and when a subset of tag-SNP markers is measured. There is a growing wealth of information on the distribution of haplotypes in different populations, and it is not unusual for investigators to measure genetic markers in pilot studies in order to gain knowledge of the distribution of haplotypes in the target population. Starting with this basic information on the distribution of haplotypes, we derive analytic methods to determine sample size or power to test the association of haplotypes with either a quantitative trait or disease status (e.g., a case-control study design), assuming that all subjects are unrelated. Our derivations cover both phase-known and phase-unknown haplotypes, allowing evaluation of the loss of efficiency due to unknown phase. We also extend our methods to when a subset of tag-SNPs is chosen, allowing investigators to explore the impact of tag-SNPs on power. Simulations illustrate that the theoretical power predictions are quite accurate over a broad range of conditions. Our theoretical formulae should provide useful guidance when planning haplotype association studies.",

keywords = "Case-control, Diplotype, Genotype, Linkage phase, Non-centrality parameter, Quantitative trait, Single nucleotide polymorphism, Tag-SNP",
author = "Schaid, {Daniel J.}",
year = "2006",
month = jan,
doi = "10.1111/j.1529-8817.2005.00215.x",
language = "English (US)",
volume = "70",
pages = "116--130",
journal = "Annals of Human Genetics",
issn = "0003-4800",
publisher = "Wiley-Blackwell",
number = "1",
}

TY - JOUR
T1 - Power and sample size for testing associations of haplotypes with complex traits
AU - Schaid, Daniel J.
PY - 2006/1
Y1 - 2006/1
AB - Evaluation of the association of haplotypes with either quantitative traits or disease status is common practice, and under some situations provides greater power than the evaluation of individual marker loci. The focus on haplotype analyses will increase as more single nucleotide polymorphisms (SNPs) are discovered, either because of interest in candidate gene regions, or because of interest in genome-wide association studies. However, there is little guidance on the determination of the sample size needed to achieve the desired power for a study, particularly when linkage phase of the haplotypes is unknown, and when a subset of tag-SNP markers is measured. There is a growing wealth of information on the distribution of haplotypes in different populations, and it is not unusual for investigators to measure genetic markers in pilot studies in order to gain knowledge of the distribution of haplotypes in the target population. Starting with this basic information on the distribution of haplotypes, we derive analytic methods to determine sample size or power to test the association of haplotypes with either a quantitative trait or disease status (e.g., a case-control study design), assuming that all subjects are unrelated. Our derivations cover both phase-known and phase-unknown haplotypes, allowing evaluation of the loss of efficiency due to unknown phase. We also extend our methods to when a subset of tag-SNPs is chosen, allowing investigators to explore the impact of tag-SNPs on power. Simulations illustrate that the theoretical power predictions are quite accurate over a broad range of conditions. Our theoretical formulae should provide useful guidance when planning haplotype association studies.

KW - Case-control
KW - Diplotype
KW - Genotype
KW - Linkage phase
KW - Non-centrality parameter
KW - Quantitative trait
KW - Single nucleotide polymorphism
KW - Tag-SNP
UR - http://www.scopus.com/inward/record.url?scp=33644830588&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33644830588&partnerID=8YFLogxK
U2 - 10.1111/j.1529-8817.2005.00215.x
DO - 10.1111/j.1529-8817.2005.00215.x
M3 - Article
C2 - 16441261
AN - SCOPUS:33644830588
SN - 0003-4800
VL - 70
SP - 116
EP - 130
JO - Annals of Human Genetics
JF - Annals of Human Genetics
IS - 1
ER -
