TY - GEN
T1 - Kernel Methods for Regression Analysis of Microbiome Compositional Data
AU - Chen, Jun
AU - Li, Hongzhe
N1 - Funding Information: We thank Rick Bushman, James Lewis, and Gary Wu for sharing the data and for many helpful discussions. This research is supported by NIH grants CA127334 and GM097505.
PY - 2013
Y1 - 2013
N2 - With the development of next-generation sequencing technologies, the human microbiome can now be studied using direct DNA sequencing. Many human diseases have been shown to be associated with disorders of the human microbiome. Previous statistical methods for associating microbiome composition with an outcome such as disease status focus on the association of the abundance of individual taxa, or their abundance ratios, with the outcome variable. However, the multiple-testing problem leads to a loss of power to detect the association. When individual taxon-level association tests fail, an overall test, which pools the individually weak association signals, can be applied to test the significance of the effect of the overall microbiome composition on an outcome variable. In this paper, we propose a kernel-based semi-parametric regression method for testing the significance of the effect of microbiome composition on a continuous or binary outcome. Our method provides the flexibility to incorporate phylogenetic information into the kernels as well as the ability to adjust naturally for covariate effects. We evaluate our method using simulations as well as a real data set, testing the significance of the effect of human gut microbiome composition on body mass index (BMI) while adjusting for total fat intake. Our results suggest that the gut microbiome has a strong effect on BMI and that this effect is independent of total fat intake.
UR - http://www.scopus.com/inward/record.url?scp=84886029183&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886029183&partnerID=8YFLogxK
U2 - 10.1007/978-1-4614-7846-1_16
DO - 10.1007/978-1-4614-7846-1_16
M3 - Conference contribution
AN - SCOPUS:84886029183
SN - 9781461478454
T3 - Springer Proceedings in Mathematics and Statistics
SP - 191
EP - 201
BT - Topics in Applied Statistics - 2012 Symposium of the International Chinese Statistical Association
T2 - 21st Symposium of the International Chinese Statistical Association, ICSA 2012
Y2 - 23 June 2012 through 26 June 2012
ER -