Indel detection from RNA-seq data: Tool evaluation and strategies for accurate detection of actionable mutations

Zhifu D Sun, Aditya Bhagwate, Naresh Prodduturi, Ping Yang, Jean-Pierre Kocher

Research output: Contribution to journal › Article

14 Scopus citations

Abstract

Driver somatic mutations are a hallmark of a tumor and can be used for diagnosis and targeted therapy. Mutations are primarily detected from tumor DNA. Because transcripts are dynamic readouts of gene activity, transcriptome profiling by RNA sequencing (RNA-seq) is becoming increasingly popular; it measures not only gene expression but also structural variations such as mutations and fusion transcripts. Although single-nucleotide variants (SNVs) can be easily identified from RNA-seq, intermediate-length insertions/deletions (indels longer than 2 bases but shorter than the sequence reads) pose significant challenges and are ignored by most RNA-seq analysis tools. This study evaluates commonly used RNA-seq analysis programs, along with variant and somatic mutation callers, in a series of data sets with simulated and known indels. The aim is to develop strategies for accurate indel detection. Our results show that RNA-seq alignment is the most important step for indel identification and that the evaluated programs vary widely in their sensitivity for mapping sequence reads with indels, from none at all to reasonably sensitive. The sensitivity is affected by sequence read length. Most variant calling programs rely on hard evidence, that is, indels marked in the alignment, whereas programs with realignment may use soft-clipped reads to infer indels. Based on these observations, we provide practical recommendations for indel detection when different RNA-seq aligners are used and demonstrate the best option, which gives highly reliable results. With careful customization of bioinformatics algorithms, RNA-seq can be reliably used to detect both SNV and indel mutations for clinical decision-making.
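The distinction the abstract draws between indels marked in the alignment and soft-clipped reads can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it only assumes that alignments carry standard SAM CIGAR strings, where `I` and `D` mark insertions and deletions, `S` marks soft clipping, and `N` marks a skipped region such as a spliced intron. The function names `classify_read` and `summarize` and the `min_indel` threshold are hypothetical, chosen for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def classify_read(cigar, min_indel=3):
    """Label a read by the kind of indel evidence its CIGAR string carries.

    min_indel=3 mirrors the abstract's "indels > 2 bases"; it is an
    assumption of this sketch, not a threshold taken from the paper.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if any(op in "ID" and n >= min_indel for n, op in ops):
        return "hard_evidence"  # indel written explicitly into the alignment
    if any(op == "S" for _, op in ops):
        return "soft_clipped"   # clipped read end; may conceal an indel
    return "no_evidence"        # matches and splice gaps (N) only


def summarize(cigars, min_indel=3):
    """Count reads in each evidence class for a collection of CIGAR strings."""
    return Counter(classify_read(c, min_indel) for c in cigars)


if __name__ == "__main__":
    # Made-up CIGAR strings for illustration, not data from the study.
    example = ["100M", "40M15D60M", "70M30S", "20M500N80M", "50M2I48M"]
    print(summarize(example))
```

Under these assumptions, an aligner that soft-clips reads at an indel instead of opening a gap would shift reads from the first class to the second, which is why a caller that counts only `I`/`D` operations could miss the indel while one that realigns clipped reads might still recover it.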

Original language: English (US)
Pages (from-to): 973-983
Number of pages: 11
Journal: Briefings in Bioinformatics
Volume: 18
Issue number: 6
State: Published - Nov 1 2017

Keywords

  • Alignment
  • EGFR
  • Indels
  • Mutation
  • RNA sequencing
  • Variant calling

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology
