A comparative study of discriminating human heart failure etiology using gene expression profiles

Xiaohong Huang; Wei Pan; Suzanne Grindle; Xinqiang Han; Yingjie Chen; Soon J. Park; Leslie W. Miller; Jennifer Hall

doi:10.1186/1471-2105-6-205

Cardiovascular Surgery

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.

Original language	English (US)
Article number	205
Journal	BMC Bioinformatics
Volume	6
DOIs	https://doi.org/10.1186/1471-2105-6-205
State	Published - Aug 24 2005

ASJC Scopus subject areas

Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics

Cite this

@article{22fb7216e4d64062952c1867fa95440a,

title = "A comparative study of discriminating human heart failure etiology using gene expression profiles",

abstract = "Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.",

author = "Xiaohong Huang and Wei Pan and Suzanne Grindle and Xinqiang Han and Yingjie Chen and Park, {Soon J.} and Miller, {Leslie W.} and Jennifer Hall",

year = "2005",

month = aug,

day = "24",

doi = "10.1186/1471-2105-6-205",

language = "English (US)",

volume = "6",

journal = "BMC Bioinformatics",

issn = "1471-2105",

publisher = "BioMed Central",

}

TY - JOUR

T1 - A comparative study of discriminating human heart failure etiology using gene expression profiles

AU - Huang, Xiaohong

AU - Pan, Wei

AU - Grindle, Suzanne

AU - Han, Xinqiang

AU - Chen, Yingjie

AU - Park, Soon J.

AU - Miller, Leslie W.

AU - Hall, Jennifer

PY - 2005/8/24

Y1 - 2005/8/24

N2 - Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.

AB - Background: Human heart failure is a complex disease that manifests from multiple genetic and environmental factors. Although ischemic and non-ischemic heart disease present clinically with many similar decreases in ventricular function, emerging work suggests that they are distinct diseases with different responses to therapy. The ability to distinguish between ischemic and non-ischemic heart failure may be essential to guide appropriate therapy and determine prognosis for successful treatment. In this paper we consider discriminating the etiologies of heart failure using gene expression libraries from two separate institutions. Results: We apply five new statistical methods, including partial least squares, penalized partial least squares, LASSO, nearest shrunken centroids and random forest, to two real datasets and compare their performance for multiclass classification. It is found that the five statistical methods perform similarly on each of the two datasets: it is difficult to correctly distinguish the etiologies of heart failure in one dataset whereas it is easy for the other one. In a simulation study, it is confirmed that the five methods tend to have close performance, though the random forest seems to have a slight edge. Conclusions: For some gene expression data, several recently developed discriminant methods may perform similarly. More importantly, one must remain cautious when assessing the discriminating performance using gene expression profiles based on a small dataset; our analysis suggests the importance of utilizing multiple or larger datasets.

UR - http://www.scopus.com/inward/record.url?scp=25444467981&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=25444467981&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-6-205

DO - 10.1186/1471-2105-6-205

M3 - Article

C2 - 16120216

AN - SCOPUS:25444467981

SN - 1471-2105

VL - 6

JO - BMC Bioinformatics

JF - BMC Bioinformatics

M1 - 205

ER -
