Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Runpu Chen; Le Yang; Steve Goodison; Yijun Sun

doi:10.1093/bioinformatics/btz769

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes.

Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.

Original language	English (US)
Pages (from-to)	1476-1483
Number of pages	8
Journal	Bioinformatics
Volume	36
Issue number	5
DOIs	https://doi.org/10.1093/bioinformatics/btz769
State	Published - Mar 1 2020

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{f692fdb2009349cab423f56d13ffa848,
  title = "Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data",
  abstract = "Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes. Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.",
  author = "Runpu Chen and Le Yang and Steve Goodison and Yijun Sun",
  note = "Publisher Copyright: {\textcopyright} 2019 The Author(s). Published by Oxford University Press.",
  year = "2020",
  month = mar,
  day = "1",
  doi = "10.1093/bioinformatics/btz769",
  language = "English (US)",
  volume = "36",
  pages = "1476--1483",
  journal = "Bioinformatics",
  issn = "1367-4803",
  publisher = "Oxford University Press",
  number = "5",
}

TY - JOUR
T1 - Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data
AU - Chen, Runpu
AU - Yang, Le
AU - Goodison, Steve
AU - Sun, Yijun
PY - 2020/3/1
Y1 - 2020/3/1

N2 - Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes. Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.

UR - http://www.scopus.com/inward/record.url?scp=85081727833&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081727833&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btz769
DO - 10.1093/bioinformatics/btz769
M3 - Article
C2 - 31603461
AN - SCOPUS:85081727833
SN - 1367-4803
VL - 36
SP - 1476
EP - 1483
JO - Bioinformatics
JF - Bioinformatics
IS - 5
ER -
