Pepitome: Evaluating improved spectral library search for identification complementarity and quality assessment

Surendra Dasari, Matthew C. Chambers, Misti A. Martinez, Kristin L. Carpenter, Amy Joan L Ham, Lorenzo J. Vega-Montoto, David L. Tabb

Research output: Contribution to journal (Article)

35 Citations (Scopus)

Abstract

Spectral libraries have emerged as a viable alternative to protein sequence databases for peptide identification. These libraries contain previously detected peptide sequences and their corresponding tandem mass spectra (MS/MS). Search engines can then identify peptides by comparing experimental MS/MS scans to those in the library. Many of these algorithms employ the dot product score for measuring the quality of a spectrum-spectrum match (SSM). This scoring system does not offer a clear statistical interpretation and ignores fragment ion m/z discrepancies in the scoring. We developed a new spectral library search engine, Pepitome, which employs statistical systems for scoring SSMs. Pepitome outperformed the leading library search tool, SpectraST, when analyzing data sets acquired on three different mass spectrometry platforms. We characterized the reliability of spectral library searches by confirming shotgun proteomics identifications through RNA-Seq data. Applying spectral library and database searches on the same sample revealed their complementary nature. Pepitome identifications enabled the automation of quality analysis and quality control (QA/QC) for shotgun proteomics data acquisition pipelines.
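The limitation of dot product scoring described above can be illustrated with a minimal sketch. It assumes spectra are lists of (m/z, intensity) pairs binned at a fixed width; the function names, the 1.0 bin width, and the peak values are illustrative assumptions, not taken from Pepitome or SpectraST. Because peaks are compared only after binning, a fragment that is shifted in m/z but stays in the same bin scores exactly like a perfect match.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (hypothetical helper)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def normalized_dot_product(query, library, bin_width=1.0):
    """Cosine-style similarity between two binned spectra, in [0, 1]."""
    q = bin_spectrum(query, bin_width)
    lib = bin_spectrum(library, bin_width)
    numerator = sum(q[k] * lib[k] for k in set(q) & set(lib))
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in lib.values())
    )
    return numerator / norm if norm else 0.0


# Made-up peaks for illustration only.
library_spectrum = [(175.1, 100.0), (300.2, 50.0), (500.3, 80.0)]
exact_query = [(175.1, 100.0), (300.2, 50.0), (500.3, 80.0)]
# Every fragment is shifted by 0.4 m/z but still falls in the same bin.
shifted_query = [(175.5, 100.0), (300.6, 50.0), (500.7, 80.0)]
unrelated_query = [(210.0, 100.0), (410.0, 50.0)]

score_exact = normalized_dot_product(exact_query, library_spectrum)
score_shifted = normalized_dot_product(shifted_query, library_spectrum)
score_unrelated = normalized_dot_product(unrelated_query, library_spectrum)
```

In this sketch the exact and shifted queries receive the same score, which is the sense in which the dot product ignores fragment ion m/z discrepancies; the score is also a bare similarity value with no associated probability.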

Original language: English (US)
Pages (from-to): 1686-1695
Number of pages: 10
Journal: Journal of Proteome Research
Volume: 11
Issue number: 3
DOIs: https://doi.org/10.1021/pr200874e
State: Published - Mar 2 2012
Externally published: Yes

Fingerprint

Libraries
Search engines
Proteomics
Peptides
Protein databases
Automation
Quality control
Mass spectrometry
Data acquisition
Identification (control systems)
Pipelines
Databases
RNA
Ions
Proteins

Keywords

  • dot products
  • hypergeometric distribution
  • quality control
  • spectral libraries bioinformatics
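
As an illustration of how the "hypergeometric distribution" keyword can relate to scoring a spectrum-spectrum match, the sketch below computes the probability of seeing at least a given number of peak matches by chance. It is a generic hypergeometric tail calculation under stated assumptions (the m/z range is divided into equally likely bins, and each peak occupies one bin); the function and parameter names are hypothetical, and this is not presented as Pepitome's actual implementation.

```python
from math import comb


def hypergeom_tail(total_bins, library_peaks, query_peaks, matches):
    """P(X >= matches) when query_peaks bins are drawn without replacement
    from total_bins, of which library_peaks are occupied by library peaks."""
    upper = min(library_peaks, query_peaks)
    denominator = comb(total_bins, query_peaks)
    tail = sum(
        comb(library_peaks, i) * comb(total_bins - library_peaks, query_peaks - i)
        for i in range(matches, upper + 1)
    )
    return tail / denominator


# Illustrative numbers only: 1000 bins, 50 library peaks, 40 query peaks.
p_few_matches = hypergeom_tail(1000, 50, 40, 3)
p_many_matches = hypergeom_tail(1000, 50, 40, 20)
```

A smaller tail probability means the observed number of matched peaks is less likely to be random, which gives the match a statistical interpretation that a raw dot product lacks.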

ASJC Scopus subject areas

  • Biochemistry
  • Chemistry (all)

Cite this

Dasari, S., Chambers, M. C., Martinez, M. A., Carpenter, K. L., Ham, A. J. L., Vega-Montoto, L. J., & Tabb, D. L. (2012). Pepitome: Evaluating improved spectral library search for identification complementarity and quality assessment. Journal of Proteome Research, 11(3), 1686-1695. https://doi.org/10.1021/pr200874e

@article{86b1e2660c1548b78bf502a6c71c6d44,
title = "Pepitome: Evaluating improved spectral library search for identification complementarity and quality assessment",
keywords = "dot products, hypergeometric distribution, quality control, spectral libraries bioinformatics",
author = "Dasari, Surendra and Chambers, {Matthew C.} and Martinez, {Misti A.} and Carpenter, {Kristin L.} and Ham, {Amy Joan L} and Vega-Montoto, {Lorenzo J.} and Tabb, {David L.}",
year = "2012",
month = "3",
day = "2",
doi = "10.1021/pr200874e",
language = "English (US)",
volume = "11",
pages = "1686--1695",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "3",
}

TY  - JOUR
T1  - Pepitome
T2  - Evaluating improved spectral library search for identification complementarity and quality assessment
AU  - Dasari, Surendra
AU  - Chambers, Matthew C.
AU  - Martinez, Misti A.
AU  - Carpenter, Kristin L.
AU  - Ham, Amy Joan L
AU  - Vega-Montoto, Lorenzo J.
AU  - Tabb, David L.
PY  - 2012/3/2
Y1  - 2012/3/2
KW  - dot products
KW  - hypergeometric distribution
KW  - quality control
KW  - spectral libraries bioinformatics
UR  - http://www.scopus.com/inward/record.url?scp=84857863377&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84857863377&partnerID=8YFLogxK
U2  - 10.1021/pr200874e
DO  - 10.1021/pr200874e
M3  - Article
C2  - 22217208
AN  - SCOPUS:84857863377
VL  - 11
SP  - 1686
EP  - 1695
JO  - Journal of Proteome Research
JF  - Journal of Proteome Research
SN  - 1535-3893
IS  - 3
ER  -