Technological improvements have shifted sequencing from low-throughput, labor-intensive, gel-based systems to high-throughput capillary systems. This has led to the broad use of genomic resequencing to identify sequence variations in genes, regulatory regions, and extended genomic regions. We describe a software package, novoSNP, that conscientiously discovers single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs) in sequence trace files in a fast, reliable, and user-friendly way. We compared the performance of novoSNP with that of PolyPhred and PolyBayes on two data sets. The first data set comprised 1028 sequence trace files obtained from diagnostic mutation analyses of SCN1A (neuronal voltage-gated sodium channel α-subunit type I gene). The second data set comprised 9062 sequence trace files from a genomic resequencing project aimed at constructing a high-density SNP map of MAPT (microtubule-associated protein tau gene). Visual inspection of these data sets had identified 38 sequence variations for SCN1A and 488 for MAPT. novoSNP automatically identified all 38 SCN1A variations, including five INDELs, while for MAPT only 15 of the 488 variations were not correctly marked. PolyPhred detected far fewer SNPs than novoSNP and missed nearly all INDELs. PolyBayes, designed for the sequence analysis of cloned templates, detected only a limited number of the variations present in the data set. Besides significantly improving the automated detection of sequence variations in both diagnostic mutation analyses and SNP discovery projects, novoSNP offers a user-friendly interface for inspecting possible genetic variations.