Summary: Mate-pair library sequencing is an effective and economical method for detecting genomic structural variants and chromosomal abnormalities. Unfortunately, mapping and aligning mate-pair reads to a reference genome is challenging and time-consuming for most next-generation sequencing alignment programs. Large insert sizes, library preparation artifacts (biotin junction reads, paired-end read contamination, chimeras, etc.) and structural variant breakpoints within reads all increase mapping and alignment complexity. We describe an algorithm that is up to 20 times faster and 25% more accurate than popular next-generation sequencing alignment programs when processing mate-pair sequencing data.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics