An omnibus test for differential distribution analysis of microbiome sequencing data

Jun Chen; Emily King; Rebecca Deek; Zhi Wei; Yue Yu; Diane Grill; Karla Ballman

doi:10.1093/bioinformatics/btx650

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

Motivation: One objective of human microbiome studies is to identify differentially abundant microbes across biological conditions. Previous statistical methods focus on detecting the shift in the abundance and/or prevalence of the microbes and treat the dispersion (spread of the data) as a nuisance. These methods also assume that the dispersion is the same across conditions, an assumption which may not hold in the presence of sample heterogeneity. Moreover, the widespread outliers in the microbiome sequencing data make existing parametric models not overly robust. Therefore, a robust and powerful method that allows covariate-dependent dispersion and addresses outliers is still needed for differential abundance analysis.

Results: We introduce a novel test for differential distribution analysis of microbiome sequencing data by jointly testing the abundance, prevalence and dispersion. The test is built on a zero-inflated negative binomial regression model and winsorized count data to account for zero-inflation and outliers. Using simulated data and real microbiome sequencing datasets, we show that our test is robust across various biological conditions and overall more powerful than previous methods.

Availability and implementation: R package is available at https://github.com/jchen1981/MicrobiomeDDA.

Contact: chen.jun2@mayo.edu or zhiwei@njit.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
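The abstract mentions winsorizing count data to limit the influence of outliers. As a rough illustration of the general idea only, the sketch below caps values above an empirical upper quantile. The function name `winsorize_upper`, the quantile values, and the example counts are assumptions made for illustration; they are not taken from the paper or from the MicrobiomeDDA package, whose actual procedure may differ.

```python
import math


def winsorize_upper(counts, quantile=0.97):
    """Cap every value above the chosen empirical quantile at that quantile.

    Hypothetical helper for illustration; not the MicrobiomeDDA implementation.
    The cap is taken from the observed data (lower order statistic), so the
    result stays integer-valued when the input is integer-valued.
    """
    if not counts:
        return []
    ordered = sorted(counts)
    # Index of the order statistic at or just below the requested quantile.
    idx = math.floor(quantile * (len(ordered) - 1))
    cap = ordered[idx]
    return [min(c, cap) for c in counts]


# Made-up read counts for one taxon across ten samples, with one extreme value.
counts = [0, 0, 3, 5, 2, 0, 4, 1, 250, 6]
winsorized = winsorize_upper(counts, quantile=0.9)
print(winsorized)
```

Only the upper tail is capped, so zeros are left untouched and can still be handled by the zero-inflation component of a count model.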

Original language	English (US)
Pages (from-to)	643-651
Number of pages	9
Journal	Bioinformatics
Volume	34
Issue number	4
DOIs	https://doi.org/10.1093/bioinformatics/btx650
State	Published - Feb 15 2018

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{e2eda346ed1f4224858f4bd28b573a5b,

title = "An omnibus test for differential distribution analysis of microbiome sequencing data",

abstract = "Motivation: One objective of human microbiome studies is to identify differentially abundant microbes across biological conditions. Previous statistical methods focus on detecting the shift in the abundance and/or prevalence of the microbes and treat the dispersion (spread of the data) as a nuisance. These methods also assume that the dispersion is the same across conditions, an assumption which may not hold in the presence of sample heterogeneity. Moreover, the widespread outliers in the microbiome sequencing data make existing parametric models not overly robust. Therefore, a robust and powerful method that allows covariate-dependent dispersion and addresses outliers is still needed for differential abundance analysis. Results: We introduce a novel test for differential distribution analysis of microbiome sequencing data by jointly testing the abundance, prevalence and dispersion. The test is built on a zero-inflated negative binomial regression model and winsorized count data to account for zero-inflation and outliers. Using simulated data and real microbiome sequencing datasets, we show that our test is robust across various biological conditions and overall more powerful than previous methods. Availability and implementation: R package is available at https://github.com/jchen1981/MicrobiomeDDA. Contact: chen.jun2@mayo.edu or zhiwei@njit.edu. Supplementary information: Supplementary data are available at Bioinformatics online.",

author = "Jun Chen and Emily King and Rebecca Deek and Zhi Wei and Yue Yu and Diane Grill and Karla Ballman",

year = "2018",

month = feb,

day = "15",

doi = "10.1093/bioinformatics/btx650",

language = "English (US)",

volume = "34",

pages = "643--651",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "4",

}

TY - JOUR

T1 - An omnibus test for differential distribution analysis of microbiome sequencing data

AU - Chen, Jun

AU - King, Emily

AU - Deek, Rebecca

AU - Wei, Zhi

AU - Yu, Yue

AU - Grill, Diane

AU - Ballman, Karla

PY - 2018/2/15

Y1 - 2018/2/15

AB - Motivation: One objective of human microbiome studies is to identify differentially abundant microbes across biological conditions. Previous statistical methods focus on detecting the shift in the abundance and/or prevalence of the microbes and treat the dispersion (spread of the data) as a nuisance. These methods also assume that the dispersion is the same across conditions, an assumption which may not hold in the presence of sample heterogeneity. Moreover, the widespread outliers in the microbiome sequencing data make existing parametric models not overly robust. Therefore, a robust and powerful method that allows covariate-dependent dispersion and addresses outliers is still needed for differential abundance analysis. Results: We introduce a novel test for differential distribution analysis of microbiome sequencing data by jointly testing the abundance, prevalence and dispersion. The test is built on a zero-inflated negative binomial regression model and winsorized count data to account for zero-inflation and outliers. Using simulated data and real microbiome sequencing datasets, we show that our test is robust across various biological conditions and overall more powerful than previous methods. Availability and implementation: R package is available at https://github.com/jchen1981/MicrobiomeDDA. Contact: chen.jun2@mayo.edu or zhiwei@njit.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

UR - http://www.scopus.com/inward/record.url?scp=85042540885&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85042540885&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btx650

DO - 10.1093/bioinformatics/btx650

M3 - Article

C2 - 29040451

AN - SCOPUS:85042540885

SN - 1367-4803

VL - 34

SP - 643

EP - 651

JO - Bioinformatics

JF - Bioinformatics

IS - 4

ER -
