The rapid development of next-generation sequencing (NGS) technology offers a new opportunity to extend the scale and resolution of genomic research. Efficiently mapping millions of short reads to a reference genome and calling variants accurately are two major challenges in NGS analysis. In this chapter, we review current software for aligning short reads and detecting single-nucleotide polymorphisms (SNPs), and we extensively evaluate its performance on normal and cancer samples from The Cancer Genome Atlas project and on trio data from the 1000 Genomes Project. We find that Burrows-Wheeler transform-based aligners are the most suitable for the Illumina platform and that NovoalignCS shows the best overall performance for SOLiD data. We also show that FaSD is the most reliable SNP caller among the several state-of-the-art programs compared. Furthermore, NGS shows significantly lower coverage and poorer SNP-calling performance in the CpG island, promoter, and 5’ UTR regions of the human genome. We show that both high GC content and low repetitive-element content cause the lower coverage in promoter regions.
|Original language|English (US)|
|Title of host publication|Next Generation Sequencing in Cancer Research: Volume 1: Decoding the Cancer Genome|
|Publisher|Springer New York|
|Number of pages|17|
|State|Published - Jan 1 2013|
- Next-generation sequencing