A fast and accurate SNP detection algorithm for next-generation sequencing data

Feng Xu, Weixin Wang, Panwen Wang, Mulin Jun Li, Pak Chung Sham, Junwen Wang

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Various methods have been developed for calling single-nucleotide polymorphisms from next-generation sequencing data. However, for satisfactory performance, most of these methods require expensive high-depth sequencing. Here, we propose a fast and accurate single-nucleotide polymorphism detection program that uses a binomial distribution-based algorithm and a mutation probability. We extensively assess this program on normal and cancer next-generation sequencing data from The Cancer Genome Atlas project and pooled data from the 1,000 Genomes Project. We also compare the performance of several state-of-the-art programs for single-nucleotide polymorphism calling and evaluate their pros and cons. We demonstrate that our program is a fast and highly accurate single-nucleotide polymorphism detection method, particularly when the sequence depth is low. The program can finish single-nucleotide polymorphism calling within four hours for 10-fold human genome next-generation sequencing data (30 gigabases) on a standard desktop computer.
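The abstract names a binomial distribution-based algorithm without giving details. As a rough illustration of the general idea only, and not the authors' program, the sketch below tests whether the number of non-reference reads at a site exceeds what sequencing error alone would plausibly produce. The function names, the per-base error rate of 0.01, and the significance threshold are all assumptions chosen for illustration; the mutation probability mentioned in the abstract is not modeled here.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_candidate_snp(depth: int, alt_reads: int,
                     error_rate: float = 0.01, alpha: float = 1e-3) -> bool:
    """Flag a site when the alt-read count is unlikely under error alone.

    error_rate and alpha are hypothetical defaults, not values from the paper.
    """
    if depth == 0:
        return False
    # Null hypothesis: every non-reference read is a sequencing error.
    p_value = binomial_tail(depth, alt_reads, error_rate)
    return p_value < alpha


if __name__ == "__main__":
    # At 10-fold depth, compare a single stray read with four supporting reads.
    print(is_candidate_snp(depth=10, alt_reads=1))
    print(is_candidate_snp(depth=10, alt_reads=4))
```

At low depth the tail probability is computed from only a handful of reads, which is why the choice of error model and threshold matters most in the low-coverage setting the abstract highlights.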

Original language: English (US)
Article number: 1258
Journal: Nature Communications
Volume: 3
DOI: 10.1038/ncomms2256
State: Published - 2012
Externally published: Yes

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Chemistry(all)
  • Physics and Astronomy(all)

Cite this

A fast and accurate SNP detection algorithm for next-generation sequencing data. / Xu, Feng; Wang, Weixin; Wang, Panwen; Li, Mulin Jun; Sham, Pak Chung; Wang, Junwen.

In: Nature Communications, Vol. 3, 1258, 2012.

@article{547b2c4dd26948d7be4f895a1a5ce02d,
title = "A fast and accurate SNP detection algorithm for next-generation sequencing data",
abstract = "Various methods have been developed for calling single-nucleotide polymorphisms from next-generation sequencing data. However, for satisfactory performance, most of these methods require expensive high-depth sequencing. Here, we propose a fast and accurate single-nucleotide polymorphism detection program that uses a binomial distribution-based algorithm and a mutation probability. We extensively assess this program on normal and cancer next-generation sequencing data from The Cancer Genome Atlas project and pooled data from the 1,000 Genomes Project. We also compare the performance of several state-of-the-art programs for single-nucleotide polymorphism calling and evaluate their pros and cons. We demonstrate that our program is a fast and highly accurate single-nucleotide polymorphism detection method, particularly when the sequence depth is low. The program can finish single-nucleotide polymorphism calling within four hours for 10-fold human genome next-generation sequencing data (30 gigabases) on a standard desktop computer.",
author = "Feng Xu and Weixin Wang and Panwen Wang and Li, Mulin Jun and Sham, Pak Chung and Junwen Wang",
year = "2012",
doi = "10.1038/ncomms2256",
language = "English (US)",
volume = "3",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group"
}

TY - JOUR

T1 - A fast and accurate SNP detection algorithm for next-generation sequencing data

AU - Xu, Feng

AU - Wang, Weixin

AU - Wang, Panwen

AU - Li, Mulin Jun

AU - Sham, Pak Chung

AU - Wang, Junwen

PY - 2012

Y1 - 2012

AB - Various methods have been developed for calling single-nucleotide polymorphisms from next-generation sequencing data. However, for satisfactory performance, most of these methods require expensive high-depth sequencing. Here, we propose a fast and accurate single-nucleotide polymorphism detection program that uses a binomial distribution-based algorithm and a mutation probability. We extensively assess this program on normal and cancer next-generation sequencing data from The Cancer Genome Atlas project and pooled data from the 1,000 Genomes Project. We also compare the performance of several state-of-the-art programs for single-nucleotide polymorphism calling and evaluate their pros and cons. We demonstrate that our program is a fast and highly accurate single-nucleotide polymorphism detection method, particularly when the sequence depth is low. The program can finish single-nucleotide polymorphism calling within four hours for 10-fold human genome next-generation sequencing data (30 gigabases) on a standard desktop computer.

UR - http://www.scopus.com/inward/record.url?scp=84871775490&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84871775490&partnerID=8YFLogxK

U2 - 10.1038/ncomms2256

DO - 10.1038/ncomms2256

M3 - Article

VL - 3

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

M1 - 1258

ER -