A partially linear tree-based regression model for assessing complex joint gene-gene and gene-environment effects

Jinbo Chen, Kai Yu, Ann Hsing, Terry M. Therneau

Research output: Contribution to journal › Article › peer-review

22 Scopus citations


The success of genetic dissection of complex diseases may greatly benefit from judicious exploration of joint gene effects, which, in turn, critically depends on the power of statistical tools. Standard regression models are convenient for assessing main effects and low-order gene-gene interactions but not for exploring complex higher-order interactions. Tree-based methodology is an attractive alternative for disentangling possible interactions, but it has difficulty in modeling additive main effects. This work proposes a new class of semiparametric regression models, termed partially linear tree-based regression (PLTR) models, which combine the advantages of generalized linear regression and tree models. A PLTR model quantifies joint effects of genes and other risk factors by a combination of linear main effects and a nonparametric tree structure. We propose an iterative algorithm to fit the PLTR model, and a unified resampling approach for identifying and testing the significance of the optimal "pruned" tree nested within the tree resulting from the fitting algorithm. Simulation studies showed that the resampling procedure maintained the correct type I error rate. We applied the PLTR model to assess the association between biliary stone risk and 53 single nucleotide polymorphisms (SNPs) in the inflammation pathway in a population-based case-control study. The analysis yielded an interesting parsimonious summary of the joint effect of all SNPs. The proposed model is also useful for exploring gene-environment interactions and has broad implications for applying the tree methodology to genetic epidemiology research.
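The idea of combining a linear main effect with a tree on genetic factors can be illustrated with a minimal sketch. It is not the authors' fitting algorithm: it assumes a continuous outcome with squared-error loss (identity link) rather than the generalized linear setting of the abstract, uses a simple backfitting-style alternation, and omits the pruning and resampling-based testing steps. All names (`fit_tree`, `predict_tree`, `fit_pltr`) and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(genes, resid, depth=0, max_depth=2):
    """Greedy regression tree on binary (0/1) features, fitted to residuals."""
    leaf = {"value": sum(resid) / len(resid)}
    if depth >= max_depth:
        return leaf
    best, best_sse = None, sse(resid) - 1e-12  # require a real improvement
    for j in range(len(genes[0])):
        left = [r for g, r in zip(genes, resid) if g[j] == 0]
        right = [r for g, r in zip(genes, resid) if g[j] == 1]
        if not left or not right:
            continue
        s = sse(left) + sse(right)
        if s < best_sse:
            best, best_sse = j, s
    if best is None:
        return leaf
    lo = [i for i, g in enumerate(genes) if g[best] == 0]
    hi = [i for i, g in enumerate(genes) if g[best] == 1]
    return {
        "feature": best,
        "left": fit_tree([genes[i] for i in lo], [resid[i] for i in lo],
                         depth + 1, max_depth),
        "right": fit_tree([genes[i] for i in hi], [resid[i] for i in hi],
                          depth + 1, max_depth),
    }


def predict_tree(tree, g):
    """Walk from the root to a leaf and return the leaf mean."""
    while "feature" in tree:
        tree = tree["right"] if g[tree["feature"]] == 1 else tree["left"]
    return tree["value"]


def fit_pltr(x, genes, y, n_iter=20, tol=1e-10):
    """Alternate between a linear slope for x and a tree on the genes.

    The tree leaves absorb the intercept, so the linear part has no intercept.
    """
    tree_pred = [0.0] * len(y)
    beta, tree = 0.0, None
    for _ in range(n_iter):
        # Linear step: least squares through the origin on y minus the tree part.
        partial = [yi - ti for yi, ti in zip(y, tree_pred)]
        new_beta = sum(xi * pi for xi, pi in zip(x, partial)) / sum(xi * xi for xi in x)
        # Tree step: fit the tree to what the linear part leaves unexplained.
        resid = [yi - new_beta * xi for yi, xi in zip(y, x)]
        tree = fit_tree(genes, resid)
        tree_pred = [predict_tree(tree, g) for g in genes]
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, tree


# Toy data (made up): linear effect of x plus a pure g1-by-g2 interaction.
genes, x, y = [], [], []
for g1 in (0, 1):
    for g2 in (0, 1):
        for xv in (-1.0, 0.0, 1.0):
            genes.append((g1, g2))
            x.append(xv)
            y.append(1.0 + 2.0 * xv + 3.0 * g1 * g2)

beta, tree = fit_pltr(x, genes, y)
print(beta, [predict_tree(tree, g) for g in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

In this noise-free toy example the linear slope captures the additive effect of `x`, while the tree isolates the subgroup carrying both variants, which is the division of labour the abstract describes.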

Original language: English (US)
Pages (from-to): 238-251
Number of pages: 14
Journal: Genetic Epidemiology
Issue number: 3
State: Published - Apr 2007


  • Gene-environment interaction
  • Gene-gene interaction
  • Generalized linear model
  • Partially linear
  • Tree model

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)

