Fast implementation of a scan statistic for identifying chromosomal patterns of genome wide association studies

Yan V. Sun; Douglas M. Jacobsen; Stephen T. Turner; Eric Boerwinkle; Sharon L.R. Kardia

doi:10.1016/j.csda.2008.04.013

Nephrology and Hypertension

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

To account for the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects, a single nucleotide polymorphism (SNP) association scan statistic was developed. To address the computational needs of genome-wide association (GWA) studies, a fast Java application, which combines single-locus SNP tests and a scan statistic for identifying chromosomal regions with significant clusters of significant SNP effects, was developed and implemented. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood-pressure-lowering effect of thiazide diuretics (N = 195) using the Affymetrix Human Mapping 100K Set. A total of 55,335 tagSNPs (pair-wise linkage disequilibrium R² < 0.5) were selected to reduce the frequency correlation between SNPs. A typical workstation can complete the whole genome scan, including 10,000 permutation tests, within 3 h. The most significant regions are located on chromosomes 3, 6, 13 and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance of ChromoScan-GWA and its scalability were tested with up to 1,000,000 SNPs and up to 4,000 subjects. Using 10,000 permutations, the computation time grew linearly in these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease and provides a method to compare GWA results even across different platforms.
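The abstract does not spell out the exact single-locus test or the form of the scan statistic, so the following Python sketch is only an illustration of the general idea, not the authors' Java implementation. It assumes a hypothetical per-SNP score (absolute Pearson correlation between genotype and phenotype), a hypothetical fixed window measured in consecutive SNPs, and a hypothetical score cutoff for flagging a SNP; the scan statistic is taken to be the largest number of flagged SNPs in any window, and its significance is estimated by permuting the phenotype.

```python
import random
from math import sqrt


def abs_correlation(x, y):
    """Absolute Pearson correlation; a stand-in for a real single-locus test."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP or constant phenotype: no signal
    return abs(sxy) / sqrt(sxx * syy)


def max_window_count(flags, window):
    """Largest number of flagged SNPs in any run of `window` consecutive SNPs."""
    if window <= 0:
        raise ValueError("window must be positive")
    current = sum(flags[:window])
    best = current
    for i in range(window, len(flags)):
        # Slide the window one SNP to the right in constant time.
        current += flags[i] - flags[i - window]
        best = max(best, current)
    return best


def scan_statistic(genotypes, phenotype, window, cutoff):
    """Flag SNPs whose score exceeds `cutoff` (assumed rule), then scan."""
    flags = [int(abs_correlation(snp, phenotype) > cutoff) for snp in genotypes]
    return max_window_count(flags, window)


def permutation_p_value(genotypes, phenotype, window, cutoff, n_perm, seed=0):
    """Permute the phenotype to estimate how unusual the observed statistic is."""
    rng = random.Random(seed)
    observed = scan_statistic(genotypes, phenotype, window, cutoff)
    shuffled = list(phenotype)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if scan_statistic(genotypes, shuffled, window, cutoff) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

Here `genotypes` is a list of SNPs in chromosomal order, each a list of per-subject allele counts, and `phenotype` is the per-subject response. Each permutation costs time proportional to the number of SNPs times the number of subjects, which is at least consistent with the linear growth in computation time reported above.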

Original language	English (US)
Pages (from-to)	1794-1801
Number of pages	8
Journal	Computational Statistics and Data Analysis
Volume	53
Issue number	5
DOIs	https://doi.org/10.1016/j.csda.2008.04.013
State	Published - Mar 15 2009

ASJC Scopus subject areas

Statistics and Probability
Computational Mathematics
Computational Theory and Mathematics
Applied Mathematics

Cite this

@article{6516f11584be487fba5056376aa9f3e0,

title = "Fast implementation of a scan statistic for identifying chromosomal patterns of genome wide association studies",

abstract = "To account for the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects, a single nucleotide polymorphism (SNP) association scan statistic was developed. To address the computational needs of genome-wide association (GWA) studies, a fast Java application, which combines single-locus SNP tests and a scan statistic for identifying chromosomal regions with significant clusters of significant SNP effects, was developed and implemented. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood-pressure-lowering effect of thiazide diuretics (N = 195) using the Affymetrix Human Mapping 100K Set. A total of 55,335 tagSNPs (pair-wise linkage disequilibrium $R^2 < 0.5$) were selected to reduce the frequency correlation between SNPs. A typical workstation can complete the whole genome scan, including 10,000 permutation tests, within 3 h. The most significant regions are located on chromosomes 3, 6, 13 and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance of ChromoScan-GWA and its scalability were tested with up to 1,000,000 SNPs and up to 4,000 subjects. Using 10,000 permutations, the computation time grew linearly in these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease and provides a method to compare GWA results even across different platforms.",

author = "Sun, {Yan V.} and Jacobsen, {Douglas M.} and Turner, {Stephen T.} and Boerwinkle, Eric and Kardia, {Sharon L.R.}",

note = "Funding Information: This work was supported by National Institutes of Health grants HL087660, HL68737, HL 74735 and HL 53335.",

year = "2009",

month = mar,

day = "15",

doi = "10.1016/j.csda.2008.04.013",

language = "English (US)",

volume = "53",

pages = "1794--1801",

journal = "Computational Statistics and Data Analysis",

issn = "0167-9473",

publisher = "Elsevier",

number = "5",

}

TY - JOUR

T1 - Fast implementation of a scan statistic for identifying chromosomal patterns of genome wide association studies

AU - Sun, Yan V.

AU - Jacobsen, Douglas M.

AU - Turner, Stephen T.

AU - Boerwinkle, Eric

AU - Kardia, Sharon L.R.

N1 - Funding Information: This work was supported by National Institutes of Health grants HL087660, HL68737, HL 74735 and HL 53335.

PY - 2009/3/15

Y1 - 2009/3/15

N2 - To account for the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects, a single nucleotide polymorphism (SNP) association scan statistic was developed. To address the computational needs of genome-wide association (GWA) studies, a fast Java application, which combines single-locus SNP tests and a scan statistic for identifying chromosomal regions with significant clusters of significant SNP effects, was developed and implemented. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood-pressure-lowering effect of thiazide diuretics (N = 195) using the Affymetrix Human Mapping 100K Set. A total of 55,335 tagSNPs (pair-wise linkage disequilibrium R² < 0.5) were selected to reduce the frequency correlation between SNPs. A typical workstation can complete the whole genome scan, including 10,000 permutation tests, within 3 h. The most significant regions are located on chromosomes 3, 6, 13 and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance of ChromoScan-GWA and its scalability were tested with up to 1,000,000 SNPs and up to 4,000 subjects. Using 10,000 permutations, the computation time grew linearly in these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease and provides a method to compare GWA results even across different platforms.

UR - http://www.scopus.com/inward/record.url?scp=60349123180&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60349123180&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2008.04.013

DO - 10.1016/j.csda.2008.04.013

M3 - Article

AN - SCOPUS:60349123180

SN - 0167-9473

VL - 53

SP - 1794

EP - 1801

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

IS - 5

ER -
