False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing

Jian Xiao, Hongyuan Cao, Jun Chen

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Motivation: Next-generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species tend to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take the phylogenetic tree into account remain limited.

Results: We propose a new FDR control procedure that incorporates prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves power similar to that of traditional procedures that do not take the tree structure into account. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. Our method identified far more alcohol-drinking-associated bacterial species than traditional methods did.

Availability and implementation: R package StructFDR is available from CRAN.
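
For readers unfamiliar with the "traditional procedures" the abstract compares against, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure, which controls the FDR without using any tree structure. It is not the authors' method and is not taken from the StructFDR package; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a reject/keep decision per hypothesis (hypothetical helper).

    Standard Benjamini-Hochberg step-up rule: sort the m p-values, find the
    largest rank k with p_(k) <= (k / m) * alpha, and reject the k hypotheses
    with the smallest p-values. No phylogenetic information is used.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value falls under its rank-specific threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Reject everything up to that rank, even if an earlier rank failed
    # its own threshold (this is what makes the procedure "step-up").
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, one per bacterial species, for illustration only.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

Because each species is tested in isolation here, a weak signal shared by closely related species receives no extra support; that gap is what a tree-informed procedure such as the one described in the abstract aims to close.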

Original language: English (US)
Pages (from-to): 2873-2881
Number of pages: 9
Journal: Bioinformatics
Volume: 33
Issue number: 18
State: Published - Sep 15 2017

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
