Measure transcript integrity using RNA-seq data

Liguo Wang, Jinfu Nie, Hugues Sicotte, Ying Li, Jeanette E Eckel-Passow, Surendra Dasari, Peter T. Vedell, Poulami Barman, Liewei M Wang, Richard Weinshiboum, Jin Jen, Haojie Huang, Manish Kohli, Jean-Pierre Kocher

Research output: Contribution to journal, Article

35 Citations (Scopus)

Abstract

Background: Stored biological samples with pathology information and medical records are invaluable resources for translational medical research. However, RNA extracted from archived clinical tissues is often substantially degraded. RNA degradation distorts RNA-seq read coverage in a gene-specific manner and profoundly influences whole-genome gene expression profiling.

Results: We developed the transcript integrity number (TIN) to measure RNA degradation. Applying it to 3 independent RNA-seq datasets, we demonstrated that TIN is a reliable and sensitive measure of RNA degradation at both the transcript and sample levels. By comparing 10 prostate cancer clinical samples with lower RNA integrity to 10 samples with higher RNA quality, we demonstrated that calibrating gene expression counts with TIN scores could effectively neutralize RNA degradation effects by reducing false positives and recovering biologically meaningful pathways. When we further evaluated TIN correction using spike-in transcripts in RNA-seq data generated by the Sequencing Quality Control consortium, TIN adjustment gave better control of false positives and false negatives (sensitivity = 0.89, specificity = 0.91, accuracy = 0.90) than gene expression analysis without TIN correction (sensitivity = 0.98, specificity = 0.50, accuracy = 0.86).

Conclusion: TIN is a reliable measurement of RNA integrity and a valuable approach for neutralizing the effects of in vitro RNA degradation and improving differential gene expression analysis.
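The sensitivity, specificity, and accuracy values quoted in the abstract are standard confusion-matrix ratios. As a minimal sketch of how such figures are derived, the Python below computes them from true/false positive and negative counts. The class name, function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper or its spike-in data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of differential-expression calls against a known truth set."""
    tp: int  # truly changed transcripts that were called changed
    fp: int  # unchanged transcripts that were called changed
    tn: int  # unchanged transcripts that were called unchanged
    fn: int  # truly changed transcripts that were missed


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly changed transcripts that were detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of unchanged transcripts correctly left uncalled."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all calls that agree with the truth set."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Made-up counts for illustration only (not from the paper).
example = ConfusionCounts(tp=8, fp=1, tn=9, fn=2)

if __name__ == "__main__":
    print(
        f"sensitivity={sensitivity(example):.2f} "
        f"specificity={specificity(example):.2f} "
        f"accuracy={accuracy(example):.2f}"
    )
```

Read this way, the abstract's comparison says that the uncorrected analysis detects slightly more true changes (higher sensitivity) but at the cost of many more false positives (much lower specificity).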

Original language: English (US)
Journal: BMC Bioinformatics
Volume: 17
Issue number: 1
DOI: 10.1186/s12859-016-0922-z
State: Published - Feb 3 2016

Fingerprint

RNA
Integrity
RNA Stability
Degradation
Gene expression
Gene Expression Analysis
Gene Expression
False Positive
Specificity
Sensitivity and Specificity
Translational Medical Research
Gene Expression Profiling
Quality Control
Prostate Cancer
Genes
Medical Records
Differential Expression
Prostatic Neoplasms
Profiling
Spike

Keywords

  • Gene expression
  • RNA-seq quality control
  • TIN
  • Transcript integrity number

ASJC Scopus subject areas

  • Applied Mathematics
  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Measure transcript integrity using RNA-seq data. / Wang, Liguo; Nie, Jinfu; Sicotte, Hugues; Li, Ying; Eckel-Passow, Jeanette E; Dasari, Surendra; Vedell, Peter T.; Barman, Poulami; Wang, Liewei M; Weinshiboum, Richard; Jen, Jin; Huang, Haojie; Kohli, Manish; Kocher, Jean-Pierre.

In: BMC Bioinformatics, Vol. 17, No. 1, 03.02.2016.

Wang, Liguo; Nie, Jinfu; Sicotte, Hugues; Li, Ying; Eckel-Passow, Jeanette E; Dasari, Surendra; Vedell, Peter T.; Barman, Poulami; Wang, Liewei M; Weinshiboum, Richard; Jen, Jin; Huang, Haojie; Kohli, Manish; Kocher, Jean-Pierre. / Measure transcript integrity using RNA-seq data. In: BMC Bioinformatics. 2016; Vol. 17, No. 1.
@article{032b1acddec942edab22e5c7e75c1663,
title = "Measure transcript integrity using RNA-seq data",
keywords = "Gene expression, RNA-seq quality control, TIN, Transcript integrity number",
author = "Liguo Wang and Jinfu Nie and Hugues Sicotte and Ying Li and Eckel-Passow, {Jeanette E} and Surendra Dasari and Vedell, {Peter T.} and Poulami Barman and Wang, {Liewei M} and Richard Weinshiboum and Jin Jen and Haojie Huang and Manish Kohli and Jean-Pierre Kocher",
year = "2016",
month = "2",
day = "3",
doi = "10.1186/s12859-016-0922-z",
language = "English (US)",
volume = "17",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Measure transcript integrity using RNA-seq data

AU - Wang, Liguo

AU - Nie, Jinfu

AU - Sicotte, Hugues

AU - Li, Ying

AU - Eckel-Passow, Jeanette E

AU - Dasari, Surendra

AU - Vedell, Peter T.

AU - Barman, Poulami

AU - Wang, Liewei M

AU - Weinshiboum, Richard

AU - Jen, Jin

AU - Huang, Haojie

AU - Kohli, Manish

AU - Kocher, Jean-Pierre

PY - 2016/2/3

Y1 - 2016/2/3

KW - Gene expression

KW - RNA-seq quality control

KW - TIN

KW - Transcript integrity number

UR - http://www.scopus.com/inward/record.url?scp=84957034217&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84957034217&partnerID=8YFLogxK

U2 - 10.1186/s12859-016-0922-z

DO - 10.1186/s12859-016-0922-z

M3 - Article

VL - 17

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 1

ER -