A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions

Lu Yang, Jun Chen

Research output: Contribution to journalArticlepeer-review

Abstract

Background: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing the special characteristics of microbiome data such as zero inflation and compositional effects. Disturbingly, different DAA tools could sometimes produce quite discordant results, opening to the possibility of cherry-picking the tool in favor of one’s own hypothesis. To recommend the best DAA tool or practice to the field, a comprehensive evaluation, which covers as many biologically relevant scenarios as possible, is critically needed. Results: We performed by far the most comprehensive evaluation of existing DAA tools using real data-based simulations. We found that DAA methods explicitly addressing compositional effects such as ANCOM-BC, Aldex2, metagenomeSeq (fitFeatureModel), and DACOMP did have improved performance in false-positive control. But they are still not optimal: type 1 error inflation or low statistical power has been observed in many settings. The recent LDM method generally had the best power, but its false-positive control in the presence of strong compositional effects was not satisfactory. Overall, none of the evaluated methods is simultaneously robust, powerful, and flexible, which makes the selection of the best DAA tool difficult. To meet the analysis needs, we designed an optimized procedure, ZicoSeq, drawing on the strength of the existing DAA methods. We show that ZicoSeq generally controlled for false positives across settings, and the power was among the highest. Application of DAA methods to a large collection of real datasets revealed a similar pattern observed in simulation studies. Conclusions: Based on the benchmarking study, we conclude that none of the existing DAA methods evaluated can be applied blindly to any real microbiome dataset. The applicability of an existing DAA method depends on specific settings, which are usually unknown a priori. To circumvent the difficulty of selecting the best DAA tool in practice, we design ZicoSeq, which addresses the major challenges in DAA and remedies the drawbacks of existing DAA methods. ZicoSeq can be applied to microbiome datasets from diverse settings and is a useful DAA tool for robust microbiome biomarker discovery. [MediaObject not available: see fulltext.].

Original languageEnglish (US)
Article number130
JournalMicrobiome
Volume10
Issue number1
DOIs
StatePublished - Dec 2022

Keywords

  • Benchmarking
  • Compositional effects
  • Differential abundance analysis
  • False discovery rate
  • Metagenomics
  • Microbiome
  • Statistical methods
  • Zero inflation

ASJC Scopus subject areas

  • Microbiology
  • Microbiology (medical)

Fingerprint

Dive into the research topics of 'A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions'. Together they form a unique fingerprint.

Cite this