A comparative study of discriminating human heart failure etiology using gene expression profiles

Xiaohong Huang, Wei Pan, Suzanne Grindle, Xinqiang Han, Yingjie Chen, Soon J. Park, Leslie W. Miller, Jennifer Hall

Research output: Contribution to journal › Article

43 Citations (Scopus)

Abstract

Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.

Original language: English (US)
Article number: 205
Journal: BMC Bioinformatics
Volume: 6
DOI: https://doi.org/10.1186/1471-2105-6-205
State: Published - Aug 24 2005
Externally published: Yes

Fingerprint

Heart Failure
Gene Expression Profile
Transcriptome
Gene expression
Comparative Study
Random Forest
Partial Least Squares
Statistical method
Therapy
Least-Squares Analysis
Ventricular Function
Penalized Least Squares
Statistical methods
Multi-class Classification
Environmental Factors
Prognosis
Centroid
Gene Expression Data
Discriminant

ASJC Scopus subject areas

  • Medicine(all)
  • Structural Biology
  • Applied Mathematics

Cite this

A comparative study of discriminating human heart failure etiology using gene expression profiles. / Huang, Xiaohong; Pan, Wei; Grindle, Suzanne; Han, Xinqiang; Chen, Yingjie; Park, Soon J.; Miller, Leslie W.; Hall, Jennifer.

In: BMC Bioinformatics, Vol. 6, 205, 24.08.2005.

Huang, X, Pan, W, Grindle, S, Han, X, Chen, Y, Park, SJ, Miller, LW & Hall, J 2005, 'A comparative study of discriminating human heart failure etiology using gene expression profiles', BMC Bioinformatics, vol. 6, 205. https://doi.org/10.1186/1471-2105-6-205
Huang, Xiaohong ; Pan, Wei ; Grindle, Suzanne ; Han, Xinqiang ; Chen, Yingjie ; Park, Soon J. ; Miller, Leslie W. ; Hall, Jennifer. / A comparative study of discriminating human heart failure etiology using gene expression profiles. In: BMC Bioinformatics. 2005 ; Vol. 6.
@article{22fb7216e4d64062952c1867fa95440a,
title = "A comparative study of discriminating human heart failure etiology using gene expression profiles",
abstract = "Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.",
author = "Xiaohong Huang and Wei Pan and Suzanne Grindle and Xinqiang Han and Yingjie Chen and Park, {Soon J.} and Miller, {Leslie W.} and Jennifer Hall",
year = "2005",
month = "8",
day = "24",
doi = "10.1186/1471-2105-6-205",
language = "English (US)",
volume = "6",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - A comparative study of discriminating human heart failure etiology using gene expression profiles

AU - Huang, Xiaohong

AU - Pan, Wei

AU - Grindle, Suzanne

AU - Han, Xinqiang

AU - Chen, Yingjie

AU - Park, Soon J.

AU - Miller, Leslie W.

AU - Hall, Jennifer

PY - 2005/8/24

Y1 - 2005/8/24

N2 - Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.

UR - http://www.scopus.com/inward/record.url?scp=25444467981&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=25444467981&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-6-205

DO - 10.1186/1471-2105-6-205

M3 - Article

VL - 6

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 205

ER -