TY - JOUR
T1 - Elevated variant density around SV breakpoints in germline lineage lends support to error-prone replication hypothesis
AU - Dhokarh, Dhananjay
AU - Abyzov, Alexej
N1 - Funding Information:
We thank Pallavi Chhabra for her immense help with artwork for the figures and Aditya Bhagwate and Taejeong Bae for helpful discussions about various tools. We acknowledge support from the Center for Individualized Medicine at Mayo Clinic.
Publisher Copyright:
© 2016 Dhokarh and Abyzov.
PY - 2016/7
Y1 - 2016/7
AB - Copy number variants (CNVs) are a class of structural variants that may involve complex genomic rearrangements (CGRs) and are hypothesized to have additional mutations around their breakpoints. Understanding the mechanisms underlying CNV formation is fundamental for understanding the repair and mutation mechanisms in cells, thereby shedding light on evolution, genomic disorders, cancer, and complex human traits. In this study, we used data from the 1000 Genomes Project to analyze hundreds of loci harboring heterozygous germline deletions in the subjects NA12878 and NA19240. By utilizing synthetic long-read data (longer than 2 kbp) in combination with high-coverage short-read data and, in parallel, by comparing with parental genomes, we interrogated the phasing of these deletions with the tens of thousands of flanking heterozygous SNPs and indels. We found that the density of SNPs/indels flanking the breakpoints of deletions (in-phase variants) is approximately twice as high as the corresponding density for the variants on the haplotype without the deletion (out-of-phase variants). This fold change was even larger for the subset of deletions with signatures of a replication-based mechanism of formation. The allele frequency (AF) spectrum for deletions is enriched for rare events, and the AF spectrum for in-phase SNPs is shifted toward this deletion spectrum, thus offering evidence consistent with the concomitance of the in-phase SNPs/indels with the deletion events. These findings therefore lend support to the hypothesis that the mutational mechanisms underlying CNV formation are error prone. Our results could also be relevant for resolving mutation-rate discrepancies in humans and for explaining kataegis.
UR - http://www.scopus.com/inward/record.url?scp=84976874887&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84976874887&partnerID=8YFLogxK
U2 - 10.1101/gr.205484.116
DO - 10.1101/gr.205484.116
M3 - Article
C2 - 27216746
AN - SCOPUS:84976874887
SN - 1088-9051
VL - 26
SP - 874
EP - 881
JO - Genome Research
JF - Genome Research
IS - 7
ER -