Abstract
Structural variations (SVs) in genomic DNA can have profound effects on the evolution of living organisms, on phenotypic variations and on disease processes. A critical step in discovering the full extent of structural variations is the development of tools to characterize these variations accurately in next-generation sequencing data. Toward this goal, we developed a software pipeline named digit that implements a novel measure of mapping ambiguity to discover interchromosomal SVs from mate-pair and paired-end sequencing data. The workflow robustly handles the high numbers of artifacts present in mate-pair sequencing and reduces the false positive rate while maintaining sensitivity. In the simulated data set, our workflow recovered 96% of simulated SVs. The pipeline generates a self-updating library of common translocations and allows for the investigation of patient- or group-specific events, making it suitable for discovering and cataloging chromosomal translocations associated with specific groups, traits, diseases or population structures.
Original language | English (US) |
---|---|
Pages (from-to) | e72 |
Journal | Nucleic Acids Research |
Volume | 45 |
Issue number | 9 |
State | Published - May 19 2017 |
ASJC Scopus subject areas
- Genetics