Unraveling complex local genomic rearrangements from long-read data

Zachary D. Stephens, Ravishankar K. Iyer, Chen Wang, Jean-Pierre Kocher

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g. from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long read datasets and demonstrate, with a concordance rate of 88.4% between the two sets, an increased ability to denote complex events over baseline calls from short read data. In a majority of the regions analyzed we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 181-187
Number of pages: 7
Volume: 2017-January
ISBN (Electronic): 9781509030491
DOI: https://doi.org/10.1109/BIBM.2017.8217647
State: Published - Dec 15 2017
Event: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017 - Kansas City, United States
Duration: Nov 13 2017 → Nov 16 2017

Other

Other: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Country: United States
City: Kansas City
Period: 11/13/17 → 11/16/17

Fingerprint

DNA
Genomic Structural Variation
Sequence Inversion
DNA Replication
Datasets

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics

Cite this

Stephens, Z. D., Iyer, R. K., Wang, C., & Kocher, J-P. (2017). Unraveling complex local genomic rearrangements from long-read data. In Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017 (Vol. 2017-January, pp. 181-187). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIBM.2017.8217647

Unraveling complex local genomic rearrangements from long-read data. / Stephens, Zachary D.; Iyer, Ravishankar K.; Wang, Chen; Kocher, Jean-Pierre.

Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. Vol. 2017-January Institute of Electrical and Electronics Engineers Inc., 2017. p. 181-187.

Stephens, ZD, Iyer, RK, Wang, C & Kocher, J-P 2017, Unraveling complex local genomic rearrangements from long-read data. in Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. vol. 2017-January, Institute of Electrical and Electronics Engineers Inc., pp. 181-187, 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, Kansas City, United States, 11/13/17. https://doi.org/10.1109/BIBM.2017.8217647
Stephens ZD, Iyer RK, Wang C, Kocher J-P. Unraveling complex local genomic rearrangements from long-read data. In Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. Vol. 2017-January. Institute of Electrical and Electronics Engineers Inc. 2017. p. 181-187. https://doi.org/10.1109/BIBM.2017.8217647
Stephens, Zachary D. ; Iyer, Ravishankar K. ; Wang, Chen ; Kocher, Jean-Pierre. / Unraveling complex local genomic rearrangements from long-read data. Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017. Vol. 2017-January Institute of Electrical and Electronics Engineers Inc., 2017. pp. 181-187
@inproceedings{aa6206849b5a4bd9b2a68fb5b4e7e497,
title = "Unraveling complex local genomic rearrangements from long-read data",
abstract = "In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g. from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long read datasets and demonstrate, with a concordance rate of 88.4{\%} between the two sets, an increased ability to denote complex events over baseline calls from short read data. In a majority of the regions analyzed we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.",
author = "Stephens, {Zachary D.} and Iyer, {Ravishankar K.} and Wang, Chen and Kocher, Jean-Pierre",
year = "2017",
month = "12",
day = "15",
doi = "10.1109/BIBM.2017.8217647",
language = "English (US)",
volume = "2017-January",
pages = "181--187",
booktitle = "Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - GEN
T1 - Unraveling complex local genomic rearrangements from long-read data
AU - Stephens, Zachary D.
AU - Iyer, Ravishankar K.
AU - Wang, Chen
AU - Kocher, Jean-Pierre
PY - 2017/12/15
Y1 - 2017/12/15
AB - In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g. from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long read datasets and demonstrate, with a concordance rate of 88.4% between the two sets, an increased ability to denote complex events over baseline calls from short read data. In a majority of the regions analyzed we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.

UR - http://www.scopus.com/inward/record.url?scp=85046277710&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046277710&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2017.8217647
DO - 10.1109/BIBM.2017.8217647
M3 - Conference contribution
AN - SCOPUS:85046277710
VL - 2017-January
SP - 181
EP - 187
BT - Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
PB - Institute of Electrical and Electronics Engineers Inc.
ER -