Quality control of RNA-seq experiments

Xing Li, Asha Nair, Shengqin Wang, Liguo Wang

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Direct sequencing of complementary DNA (cDNA) using high-throughput sequencing technologies (RNA-seq) is widely used and allows a more comprehensive understanding of the transcriptome than microarrays. In theory, RNA-seq should be able to precisely identify and quantify all RNA species, small or large, at low or high abundance. However, RNA-seq is a complicated, multistep process involving reverse transcription, amplification, fragmentation, purification, adaptor ligation, and sequencing. A mistake at any of these steps can produce biased or even unusable data. Additionally, biases intrinsic to RNA-seq (such as GC bias and nucleotide composition bias) and transcriptome complexity can also degrade data quality. Therefore, comprehensive quality assessment is the first and most critical step before any downstream analysis and interpretation of results. This chapter discusses the most widely used quality control metrics, including sequence quality, sequencing depth, read duplication rates (clonal reads), alignment quality, nucleotide composition bias, PCR bias, GC bias, rRNA and mitochondrial contamination, and coverage uniformity.
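To make a few of the metrics named in the abstract concrete, here is a minimal Python sketch of three read-level checks: GC content, mean base quality, and duplication rate. It is an illustration only and is not taken from the chapter. The function names, the toy reads, and the assumption of Phred+33 quality encoding are all hypothetical choices made for this example; duplicates are defined here simply as reads with an identical sequence, whereas tools working on aligned data usually define them by mapping position.

```python
from collections import Counter
from typing import Sequence


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in one read (0.0 for an empty read)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string.

    Assumes Phred+33 encoding by default (an assumption of this sketch).
    """
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def duplication_rate(seqs: Sequence[str]) -> float:
    """Fraction of reads that repeat an already-seen sequence.

    Sequence-identity definition of a duplicate; position-based
    definitions on aligned reads will generally give different numbers.
    """
    if not seqs:
        return 0.0
    unique = len(Counter(seqs))
    return 1.0 - unique / len(seqs)


# Hypothetical toy reads as (sequence, quality string) pairs.
reads = [
    ("ACGT", "IIII"),
    ("ACGT", "IIII"),
    ("GGCC", "!!!!"),
    ("ATAT", "5555"),
]

gc_values = [gc_fraction(seq) for seq, _ in reads]
quality_values = [mean_phred(qual) for _, qual in reads]
dup_rate = duplication_rate([seq for seq, _ in reads])
```

In practice these per-read values would be summarized across a whole library, for example as a GC-content histogram or a per-position quality profile, and compared against what is expected for the organism and protocol.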

Original language: English (US)
Title of host publication: RNA Bioinformatics
Publisher: Springer New York
Pages: 137-146
Number of pages: 10
ISBN (Electronic): 9781493922918
ISBN (Print): 9781493922901
DOIs: https://doi.org/10.1007/978-1-4939-2291-8_8
State: Published - Jan 10 2015

Fingerprint

Quality Control
Quality control
RNA
Transcriptome
Nucleotides
Experiments
High-Throughput Nucleotide Sequencing
Mitochondria
Reverse Transcription
Ligation
Transcription
Microarrays
Chemical analysis
Complementary DNA
Purification
Amplification
Technology
Contamination
Polymerase Chain Reaction
Throughput

Keywords

  • High-throughput sequencing
  • Next-generation sequencing
  • Quality control
  • RNA-seq

ASJC Scopus subject areas

  • Medicine(all)
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

Li, X., Nair, A., Wang, S., & Wang, L. (2015). Quality control of RNA-seq experiments. In RNA Bioinformatics (pp. 137-146). Springer New York. https://doi.org/10.1007/978-1-4939-2291-8_8
