TY - GEN
T1 - Unraveling complex local genomic rearrangements from long-read data
AU - Stephens, Zachary D.
AU - Iyer, Ravishankar K.
AU - Wang, Chen
AU - Kocher, Jean Pierre A.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/15
Y1 - 2017/12/15
AB - In this paper, we present a graph search approach for identifying arbitrarily complex structural genomic variation. Our method leverages the ability of long reads (e.g., from Pacific Biosciences platforms) to span multiple breakpoints of complicated local rearrangements, allowing us to resolve small-scale complexities that may be overlooked by other tools. We applied our method to a subset of NA12878 germline events using two long-read datasets and demonstrate, with a concordance rate of 88.4% between the two sets, an increased ability to denote complex events over baseline calls from short-read data. In a majority of the regions analyzed, we detected small complexities that flank the breakpoints of larger events, including small insertions, inversions, and duplicated sequences. These patterns of complexity match known mechanisms associated with DNA replication and structural variant formation, and showcase the ability of our approach to efficiently unravel such events. Our method automatically classifies complex structural variant calls as a combination of nested or adjacent reference transformations, allowing users to identify specific structure types of interest. Additionally, an output report is generated for each event with interactive visual representations of the rearrangement.
UR - http://www.scopus.com/inward/record.url?scp=85046277710&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046277710&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2017.8217647
DO - 10.1109/BIBM.2017.8217647
M3 - Conference contribution
AN - SCOPUS:85046277710
T3 - Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
SP - 181
EP - 187
BT - Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
A2 - Yoo, Illhoi
A2 - Zheng, Jane Huiru
A2 - Gong, Yang
A2 - Hu, Xiaohua Tony
A2 - Shyu, Chi-Ren
A2 - Bromberg, Yana
A2 - Gao, Jean
A2 - Korkin, Dmitry
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Y2 - 13 November 2017 through 16 November 2017
ER -