RFtest: A Robust and Flexible Community-Level Test for Microbiome Data Powerfully Detects Phylogenetically Clustered Signals

Lujun Zhang, Yanshan Wang, Jingwen Chen, Jun Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Random forest is considered one of the most successful machine learning algorithms and has been widely used to construct microbiome-based predictive models. However, its use as a statistical testing method has not been explored. In this study, we propose “Random Forest Test” (RFtest), a global (community-level) test based on random forest for high-dimensional and phylogenetically structured microbiome data. RFtest is a permutation test that uses the generalization error of random forest as the test statistic. Our simulations demonstrate that RFtest controls the type I error rate, that it is more powerful than competing methods for phylogenetically clustered signals, and that it is robust to outliers and adaptive to interaction effects and non-linear associations. Finally, we apply RFtest to two real microbiome datasets to ascertain whether the microbial communities are associated with the outcome variables.
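The permutation idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the generalization error is estimated by the out-of-bag (OOB) misclassification rate of a scikit-learn `RandomForestClassifier`, and the function names `oob_error` and `permutation_p_value`, the tree count, and the number of permutations are all hypothetical choices made for this example.

```python
import random


def oob_error(X, y, n_trees=500, seed=0):
    """Out-of-bag misclassification rate of a random forest.

    Assumption: OOB error stands in for the generalization error.
    scikit-learn is imported lazily so the permutation routine below
    can be used with any other statistic without it.
    """
    from sklearn.ensemble import RandomForestClassifier

    forest = RandomForestClassifier(
        n_estimators=n_trees, oob_score=True, random_state=seed
    )
    forest.fit(X, y)
    # oob_score_ is OOB accuracy, so the error is its complement.
    return 1.0 - forest.oob_score_


def permutation_p_value(statistic, X, y, n_perm=999, seed=0):
    """One-sided permutation p-value for an error-type statistic.

    A smaller error means a stronger association, so we count how often
    the error under shuffled outcomes is at least as small as observed.
    """
    rng = random.Random(seed)
    observed = statistic(X, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between X and y
        if statistic(X, y_perm) <= observed:
            hits += 1
    # The +1 terms count the observed labelling as one permutation,
    # which keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

With a samples-by-taxa abundance matrix and a binary outcome (hypothetical names `abundance_table` and `outcome`), the test would be run as `permutation_p_value(oob_error, abundance_table, outcome)`. Each permutation refits the forest, so the run time scales with `n_perm`.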

Original language: English (US)
Article number: 749573
Journal: Frontiers in Genetics
Volume: 12
State: Published - Jan 24 2022

Keywords

  • community-wide test
  • hypothesis testing
  • microbiome
  • omics association test
  • random forest

ASJC Scopus subject areas

  • Molecular Medicine
  • Genetics
  • Genetics (clinical)
