A weighted U-Statistic for genetic association analyses of sequencing data

Changshuai Wei, Ming Li, Zihuai He, Olga Vsevolozhskaya, Daniel J. Schaid, Qing Lu

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very-low-density lipoprotein cholesterol.

Original language: English (US)
Pages (from-to): 699-708
Number of pages: 10
Journal: Genetic Epidemiology
Volume: 38
Issue number: 8
State: Published - Dec 1 2014

Keywords

  • Next-generation sequencing
  • Rare variants
  • Weighted U-statistic

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)
