Softsearch

Integration of multiple sequence features to identify breakpoints of structural variations

Steven Hart, Vivekananda Sarangi, Raymond Moore, Saurabh Baheti, Jaysheel D. Bhavsar, Fergus J Couch, Jean-Pierre Kocher

Research output: Contribution to journal › Article

25 Citations (Scopus)

Abstract

Background: Structural variation (SV) represents a significant, yet poorly understood, contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but no single detection tool is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SVs, including split reads, discordant read pairs, and unmated pairs. Co-localized split reads and discordant read pairs are used to refine the breakpoints.

Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1) no requirement for secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read pairs, which on their own would not be sufficient to make an SV call.

Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.
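The idea of combining weak split-read and discordant-pair evidence can be illustrated with a minimal Python sketch. This is not the SoftSearch implementation: the `Read` record, the function names, and every threshold (`min_clip`, `max_insert`, `min_clipped_reads`, `min_discordant`, `window`) are hypothetical choices made for illustration. It assumes SAM-style fields (1-based leftmost position, CIGAR string, signed template length) and ignores hard clips for brevity.

```python
import re
from collections import Counter
from dataclasses import dataclass

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


@dataclass
class Read:
    """Hypothetical minimal alignment record using SAM-style fields."""
    name: str
    chrom: str
    pos: int         # 1-based leftmost aligned position
    cigar: str
    mate_chrom: str
    mate_pos: int
    tlen: int        # signed template length


def soft_clip_breakpoints(read, min_clip=5):
    """Reference positions where a soft clip of >= min_clip bases abuts the alignment."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(read.cigar)]
    points = []
    if ops and ops[0][1] == "S" and ops[0][0] >= min_clip:
        points.append(read.pos)  # clip sits just left of the first aligned base
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        points.append(read.pos + ref_len - 1)  # last aligned base before the clip
    return points


def is_discordant(read, max_insert=1000):
    """Mate on another chromosome, or template longer than an assumed library maximum."""
    return read.chrom != read.mate_chrom or abs(read.tlen) > max_insert


def candidate_breakpoints(reads, min_clipped_reads=2, min_discordant=2, window=500):
    """Keep soft-clip positions that also have nearby discordant-pair support."""
    clip_counts = Counter()
    for r in reads:
        for p in soft_clip_breakpoints(r):
            clip_counts[(r.chrom, p)] += 1
    discordant = [r for r in reads if is_discordant(r)]
    calls = []
    for (chrom, p), n_clip in sorted(clip_counts.items()):
        if n_clip < min_clipped_reads:
            continue
        support = sum(
            1 for r in discordant if r.chrom == chrom and abs(r.pos - p) <= window
        )
        if support >= min_discordant:
            calls.append((chrom, p, n_clip, support))
    return calls


# Toy data: two reads clipped at the same base plus two discordant pairs nearby.
reads = [
    Read("a", "chr13", 1000, "40M10S", "chr13", 1200, 300),
    Read("b", "chr13", 1010, "30M20S", "chr13", 1250, 320),
    Read("c", "chr13", 900, "50M", "chr13", 6000, 5150),
    Read("d", "chr13", 950, "50M", "chr2", 400, 0),
    Read("e", "chr13", 5000, "10S40M", "chr13", 5200, 250),
]
calls = candidate_breakpoints(reads)
```

In this toy example neither the two clipped reads nor the two discordant pairs would be convincing alone; requiring both kinds of evidence at the same locus is what produces the single candidate, while the lone clipped read `e` is dropped.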

Original language: English (US)
Article number: e83356
Journal: PLoS One
Volume: 8
Issue number: 12
DOI: 10.1371/journal.pone.0083356
State: Published - Dec 16 2013

Fingerprint

BRCA2 Gene
Exome
Workflow
DNA Sequence Analysis
sequence analysis
Genes
Genome
Technology
genome
genes
DNA
Experiments
Datasets

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Softsearch: Integration of multiple sequence features to identify breakpoints of structural variations. / Hart, Steven; Sarangi, Vivekananda; Moore, Raymond; Baheti, Saurabh; Bhavsar, Jaysheel D.; Couch, Fergus J; Kocher, Jean-Pierre.

In: PLoS One, Vol. 8, No. 12, e83356, 16.12.2013.

@article{d94a85e7d68e4067b63213ae380b9725,
title = "Softsearch: Integration of multiple sequence features to identify breakpoints of structural variations",
abstract = "Background: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1) not requiring secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) is applicable to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.",
author = "Steven Hart and Vivekananda Sarangi and Raymond Moore and Saurabh Baheti and Bhavsar, {Jaysheel D.} and Couch, {Fergus J} and Jean-Pierre Kocher",
year = "2013",
month = "12",
day = "16",
doi = "10.1371/journal.pone.0083356",
language = "English (US)",
volume = "8",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",

}

TY - JOUR

T1 - Softsearch

T2 - Integration of multiple sequence features to identify breakpoints of structural variations

AU - Hart, Steven

AU - Sarangi, Vivekananda

AU - Moore, Raymond

AU - Baheti, Saurabh

AU - Bhavsar, Jaysheel D.

AU - Couch, Fergus J

AU - Kocher, Jean-Pierre

PY - 2013/12/16

Y1 - 2013/12/16

AB - Background: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1) not requiring secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) is applicable to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.

UR - http://www.scopus.com/inward/record.url?scp=84892656321&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892656321&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0083356

DO - 10.1371/journal.pone.0083356

M3 - Article

VL - 8

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 12

M1 - e83356

ER -