Predictive modeling of microbiome data using a phylogeny-regularized generalized linear mixed model

Jian Xiao, Li Chen, Stephen Johnson, Yue Yu, Xianyang Zhang, Jun Chen

Research output: Contribution to journal › Article

3 Scopus citations

Abstract

Recent human microbiome studies have revealed an essential role of the human microbiome in health and disease, opening up the possibility of building microbiome-based predictive models for individualized medicine. One unique characteristic of microbiome data is the existence of a phylogenetic tree that relates all the microbial species. It has frequently been observed that a cluster or clusters of bacteria at varying phylogenetic depths are associated with some clinical or biological outcome due to shared biological function (clustered signal). Moreover, in many cases, we observe a community-level change, where a large number of functionally interdependent species are associated with the outcome (dense signal). We thus develop "glmmTree," a prediction method based on a generalized linear mixed model framework, for capturing clustered and dense microbiome signals. glmmTree uses the similarity between microbiomes, which is defined based on the microbiome composition and the phylogenetic tree, to predict the outcome. The effects of other predictive variables (e.g., age, sex) can be incorporated readily in the regression framework. Additional tuning parameters enable a data-adaptive approach to capture signals at different phylogenetic depths and abundance levels. Simulation studies and real data applications demonstrated that "glmmTree" outperformed existing methods in the dense and clustered signal scenarios.
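
The idea of predicting an outcome from a phylogeny-aware similarity between microbiomes can be illustrated with a small sketch. This is not the authors' glmmTree implementation. It assumes a Gaussian outcome, an OTU correlation of the form exp(-2 * rho * D) computed from tree (patristic) distances D, a power transform `a` on the abundances, and a fixed variance ratio `lam`. The names `phylo_kernel`, `blup_predict`, `rho`, `a`, and `lam` are hypothetical, and the data are simulated toy values.

```python
import numpy as np

def phylo_kernel(X, D, rho=1.0, a=0.5):
    """Sample-by-sample similarity from abundances X (n x p) and
    OTU tree distances D (p x p). Assumed form, not the published one."""
    C = np.exp(-2.0 * rho * D)  # OTU correlation decays with tree distance (assumption)
    Z = X ** a                  # power transform controls the weight of abundant OTUs
    return Z @ C @ Z.T

def blup_predict(K, y, train, test, lam=1.0):
    """Predict test outcomes under y = mu + g + e with g ~ N(0, s2_g * K).
    lam stands for the ratio s2_e / s2_g and is treated as a tuning parameter."""
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

# Toy data: 30 samples, 8 OTUs placed along a path-shaped tree.
rng = np.random.default_rng(0)
n, p = 30, 8
D = 0.1 * np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.dirichlet(np.ones(p), size=n)                     # relative abundances
y = X[:, :3].sum(axis=1) + 0.05 * rng.standard_normal(n)  # clustered signal in 3 neighboring OTUs

train, test = np.arange(20), np.arange(20, 30)
K = phylo_kernel(X, D, rho=1.0, a=0.5)
pred = blup_predict(K, y, train, test, lam=0.1)
```

In this sketch, `rho` plays the role of a depth parameter (larger values make only close relatives similar) and `a` the role of an abundance-level parameter. In practice both, along with `lam`, would be chosen by cross-validation or estimated from the data. Covariates such as age or sex would enter as additional fixed effects in place of the single intercept `mu`.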

Original language: English (US)
Article number: 1391
Journal: Frontiers in Microbiology
Volume: 9
Issue number: JUN
State: Published - Jun 27 2018

Keywords

  • Generalized mixed model
  • Kernel method
  • Microbiome
  • Phylogenetic tree
  • Predictive model

ASJC Scopus subject areas

  • Microbiology
  • Microbiology (medical)