A tool for RNA sequencing sample identity check

Jinyan Huang, Jun Chen, Mark Lathrop, Liming Liang

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

RNA sequencing (RNA-seq) is becoming a major method of choice for studying transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis and may result in false discoveries and a loss of power to detect true biological relationships. When genetic data are available, for example in eQTL studies or when samples have previously been genotyped or DNA sequenced, it is possible to combine the genetic data with RNA-seq data to detect sample contamination and resolve sample swaps. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype samples (from SNP arrays or DNA sequencing) and gene expression samples (from RNA-seq). IDCheck compares RNA-seq reads with SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of the relevant parameters, we can detect sample contamination and identify the correct sample pairs when swapping has occurred. Our tool provides an efficient and convenient way to evaluate and resolve these problems.
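To make the idea of a likelihood-based identity check concrete, here is a minimal Python sketch. It is not IDCheck's actual model or code, which the abstract does not specify. It assumes a simple binomial mixture in which a fraction `alpha` of the reads at each SNP comes from a contaminant drawn at the population alternate-allele frequency. The function names, the `error` rate, and the grid search are all hypothetical choices made for illustration.

```python
import math


def log_likelihood(genotypes, ref_counts, alt_counts, alt_freqs, alpha, error=0.01):
    """Log-likelihood of RNA-seq allele counts given SNP genotypes.

    Illustrative assumption: genotypes are alt-allele dosages (0, 1, 2), and a
    fraction `alpha` of reads comes from a contaminant whose alt-allele
    probability equals the population frequency. The binomial coefficient is
    dropped because it does not depend on `alpha` or on the genotypes.
    """
    total = 0.0
    for g, ref, alt, f in zip(genotypes, ref_counts, alt_counts, alt_freqs):
        # Expected alt-allele fraction for a mixture of sample and contaminant.
        p = (1.0 - alpha) * (g / 2.0) + alpha * f
        # Symmetric sequencing error keeps p strictly inside (0, 1).
        p = p * (1.0 - error) + (1.0 - p) * error
        total += alt * math.log(p) + ref * math.log(1.0 - p)
    return total


def estimate_contamination(genotypes, ref_counts, alt_counts, alt_freqs, steps=100):
    """Grid-search maximum likelihood estimate of alpha on [0, 1]."""
    best_alpha, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        alpha = i / steps
        ll = log_likelihood(genotypes, ref_counts, alt_counts, alt_freqs, alpha)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha, best_ll


def best_match(candidates, ref_counts, alt_counts, alt_freqs):
    """Return the name of the genotype sample that best explains the reads."""
    scores = {
        name: estimate_contamination(g, ref_counts, alt_counts, alt_freqs)[1]
        for name, g in candidates.items()
    }
    return max(scores, key=scores.get)
```

Under these assumptions, a large estimated `alpha` for the labelled genotype sample suggests contamination, while a different genotype sample scoring a clearly higher likelihood suggests a swap. A real analysis would also need to account for effects this sketch ignores, such as allele-specific expression and mapping bias.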

Original language: English (US)
Pages (from-to): 1463-1464
Number of pages: 2
Journal: Bioinformatics
Volume: 29
Issue number: 11
DOI: 10.1093/bioinformatics/btt155
State: Published - Jun 1 2013
Externally published: Yes

Fingerprint

RNA Sequence Analysis
RNA
Sequencing
Quantitative Trait Loci
Contamination
Single Nucleotide Polymorphism
Gene expression
Genotype
Likelihood Functions
Gene Expression
DNA
DNA Sequence Analysis
Transcriptome
Resolve
Maximum likelihood
DNA Sequencing
Concordance
Maximum Likelihood Estimate
Likelihood
Evaluate

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Medicine (all)
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

A tool for RNA sequencing sample identity check. / Huang, Jinyan; Chen, Jun; Lathrop, Mark; Liang, Liming.

In: Bioinformatics, Vol. 29, No. 11, 01.06.2013, p. 1463-1464.

Huang, Jinyan ; Chen, Jun ; Lathrop, Mark ; Liang, Liming. / A tool for RNA sequencing sample identity check. In: Bioinformatics. 2013 ; Vol. 29, No. 11. pp. 1463-1464.
@article{065f46a1aa7c4c19a1bf03e8a579d83b,
title = "A tool for RNA sequencing sample identity check",
abstract = "RNA sequencing (RNA-seq) is becoming a major method of choice for studying transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis and may result in false discoveries and a loss of power to detect true biological relationships. When genetic data are available, for example in eQTL studies or when samples have previously been genotyped or DNA sequenced, it is possible to combine the genetic data with RNA-seq data to detect sample contamination and resolve sample swaps. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype samples (from SNP arrays or DNA sequencing) and gene expression samples (from RNA-seq). IDCheck compares RNA-seq reads with SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of the relevant parameters, we can detect sample contamination and identify the correct sample pairs when swapping has occurred. Our tool provides an efficient and convenient way to evaluate and resolve these problems.",
author = "Jinyan Huang and Jun Chen and Mark Lathrop and Liming Liang",
year = "2013",
month = "6",
day = "1",
doi = "10.1093/bioinformatics/btt155",
language = "English (US)",
volume = "29",
pages = "1463--1464",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "11",

}

TY - JOUR

T1 - A tool for RNA sequencing sample identity check

AU - Huang, Jinyan

AU - Chen, Jun

AU - Lathrop, Mark

AU - Liang, Liming

PY - 2013/6/1

Y1 - 2013/6/1

N2 - RNA sequencing (RNA-seq) is becoming a major method of choice for studying transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis and may result in false discoveries and a loss of power to detect true biological relationships. When genetic data are available, for example in eQTL studies or when samples have previously been genotyped or DNA sequenced, it is possible to combine the genetic data with RNA-seq data to detect sample contamination and resolve sample swaps. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype samples (from SNP arrays or DNA sequencing) and gene expression samples (from RNA-seq). IDCheck compares RNA-seq reads with SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of the relevant parameters, we can detect sample contamination and identify the correct sample pairs when swapping has occurred. Our tool provides an efficient and convenient way to evaluate and resolve these problems.

UR - http://www.scopus.com/inward/record.url?scp=84878266221&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878266221&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btt155

DO - 10.1093/bioinformatics/btt155

M3 - Article

VL - 29

SP - 1463

EP - 1464

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 11

ER -