TY - JOUR
T1 - The eSNV-detect
T2 - A computational system to identify expressed single nucleotide variants from transcriptome sequencing data
AU - Tang, Xiaojia
AU - Baheti, Saurabh
AU - Shameer, Khader
AU - Thompson, Kevin J.
AU - Wills, Quin
AU - Niu, Nifang
AU - Holcomb, Ilona N.
AU - Boutet, Stephane C.
AU - Ramakrishnan, Ramesh
AU - Kachergus, Jennifer M.
AU - Kocher, Jean Pierre A.
AU - Weinshilboum, Richard M.
AU - Wang, Liewei
AU - Thompson, E. Aubrey
AU - Kalari, Krishna R.
N1 - Funding Information:
This work is supported by the Mayo Clinic Center for Individualized Medicine (CIM). K.R.K. is supported by the Eveleigh Family Career Development Award and the Mayo Clinic Breast Specialized Program of Research Excellence (SPORE). Additional support was also obtained from the 26.2 with Donna Foundation, the NIH Pharmacogenomics Research Network (U19 GM61388) and the Mayo Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Conflict of interest statement. None declared.
Publisher Copyright:
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2014/12/16
Y1 - 2014/12/16
AB - Rapid development of next-generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. A number of software pipelines are available for calling single nucleotide variants from genomic DNA, but no comprehensive pipelines exist to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed the eSNV-Detect, a novel computational system that uses data from multiple aligners to call variants from RNA-Seq, even at low read depths, and to rank them. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole exome coding data showed 90.6-96.8% precision and 91.6-95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/.
UR - http://www.scopus.com/inward/record.url?scp=84924312533&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84924312533&partnerID=8YFLogxK
U2 - 10.1093/nar/gku1005
DO - 10.1093/nar/gku1005
M3 - Article
C2 - 25352556
AN - SCOPUS:84924312533
SN - 0305-1048
VL - 42
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 22
M1 - e172
ER -