Quality control of RNA-seq experiments

Xing Li, Asha Nair, Shengqin Wang, Liguo Wang

Research output: Contribution to journal (Article)

17 Scopus citations

Abstract

Direct sequencing of complementary DNA (cDNA) using high-throughput sequencing technologies (RNA-seq) is widely used and allows a more comprehensive understanding of the transcriptome than microarrays. In theory, RNA-seq should be able to precisely identify and quantify all RNA species, small or large, at low or high abundance. However, RNA-seq is a complicated, multistep process involving reverse transcription, amplification, fragmentation, purification, adaptor ligation, and sequencing. Improper operation at any of these steps can produce biased or even unusable data. Additionally, biases intrinsic to RNA-seq (such as GC bias and nucleotide composition bias) and transcriptome complexity can also compromise the data. Therefore, comprehensive quality assessment is the first and most critical step before all downstream analysis and interpretation of results. This chapter discusses the most widely used quality control metrics, including sequence quality, sequencing depth, read duplication rates (clonal reads), alignment quality, nucleotide composition bias, PCR bias, GC bias, rRNA and mitochondrial contamination, and coverage uniformity.
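
As a rough illustration of three of the read-level metrics named in the abstract (sequence quality, read duplication rate, and GC content), the sketch below computes them for a handful of toy reads. It is not taken from the chapter: the function names (`gc_fraction`, `mean_phred`, `duplication_rate`) and the reads are hypothetical, Phred+33 quality encoding is assumed, and duplicates are identified here by exact sequence identity only, whereas QC tools may also define them by mapping position.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G/C bases among unambiguous (A/C/G/T) bases in a read."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def duplication_rate(seqs):
    """Fraction of reads that repeat an exact sequence already present."""
    if not seqs:
        return 0.0
    unique = len(Counter(seqs))
    return 1.0 - unique / len(seqs)


# Toy reads, made up for illustration: (sequence, quality string).
reads = [
    ("ACGTACGT", "IIIIIIII"),
    ("ACGTACGT", "IIIIIIII"),
    ("GGGGCCCC", "55555555"),
    ("ATATATAT", "IIII5555"),
]
seqs = [seq for seq, _ in reads]

summary = {
    "mean_gc": sum(gc_fraction(s) for s in seqs) / len(seqs),
    "mean_quality": sum(mean_phred(q) for _, q in reads) / len(reads),
    "duplication_rate": duplication_rate(seqs),
}
print(summary)
```

On real data these values would be computed per sample from FASTQ or alignment files and compared across samples; the remaining metrics in the abstract (alignment quality, rRNA and mitochondrial contamination, coverage uniformity) require aligned reads and are not covered by this sketch.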

Original language: English (US)
Pages (from-to): 137-146
Number of pages: 10
Journal: Methods in Molecular Biology
Volume: 1269
State: Published - Jan 1 2015

Keywords

  • High-throughput sequencing
  • Next-generation sequencing
  • Quality control
  • RNA-seq

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
