A tool for RNA sequencing sample identity check

Jinyan Huang, Jun Chen, Mark Lathrop, Liming Liang

Research output: Contribution to journalArticle

5 Scopus citations

Abstract

RNA sequencing data are becoming a major method of choice to study transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis and may result in false discovery and lose power to detect the true biological relationships. When genetic data are available, for example, in eQTL studies or samples have been previously genotyped or DNA sequenced, it is possible to combine genetic data and RNA-seq data to detect sample contamination and resolve sample swapping problems. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype (from SNP arrays or DNA sequencing) and gene expression (RNA-seq) samples. IDCheck compares the identity of RNA-seq reads and SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of relevant parameters, we can detect sample contamination and identify correct sample pairs when swapping occurs. Our tool provides an efficient and convenient way to evaluate and resolve these problems.

Original languageEnglish (US)
Pages (from-to)1463-1464
Number of pages2
JournalBioinformatics
Volume29
Issue number11
DOIs
StatePublished - Jun 1 2013

    Fingerprint

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this