LongAGE: Defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads

Quang Tran, Alexej Abyzov

Research output: Contribution to journal › Article › peer-review

Abstract

Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation - LongAGE - based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kb. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations.
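The linear-memory idea behind the classical Hirschberg algorithm mentioned in the abstract can be sketched as follows. This is only an illustration for plain global alignment with a linear gap penalty; it is not the LongAGE or AGE implementation, which handles gap excision and is not described in this record. The function names and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Return the last row of the global-alignment score matrix for a vs b,
    keeping only two rows in memory at any time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return an optimal global alignment (two gapped strings) of a and b,
    found by divide and conquer instead of a full traceback matrix."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # unless two gaps would score better than that pairing.
        def sub(k):
            return match if a == b[k] else mismatch
        j = max(range(len(b)), key=sub)
        if sub(j) >= 2 * gap:
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b

    mid = len(a) // 2
    # Scores of aligning the first half of a with every prefix of b ...
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    # ... and the second half of a with every suffix of b (via reversal).
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    n = len(b)
    # The optimal path crosses row `mid` where the two halves sum highest.
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])

    l1, l2 = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    r1, r2 = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return l1 + r1, l2 + r2


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score two gapped strings of equal length, column by column."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += gap
        else:
            total += match if p == q else mismatch
    return total


if __name__ == "__main__":
    x, y = hirschberg("AGTACGCA", "TATGC")
    print(x)
    print(y)
```

The score rows need memory proportional to the length of one sequence rather than to the product of both lengths, which is the property that makes this family of methods attractive for reads longer than 10 kb. The slicing used here copies substrings for readability; an implementation aimed at long reads would pass index ranges instead.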

Original language: English (US)
Pages (from-to): 1015-1017
Number of pages: 3
Journal: Bioinformatics
Volume: 37
Issue number: 7
State: Published - Apr 1 2021

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
