Comprehensive Peptide Searching Workflow to Maximize Protein Identifications
Overview
Purpose: Development of a comprehensive protein identification
workflow to maximize high-confidence peptide/protein
identifications, including post-translational modifications
(PTMs), compared to a traditional database search strategy.
Methods: Use of a combination of multiple search engines
(e.g., SEQUEST®, Sequest HT, Mascot and MS Amanda),
where combinations of PTMs were judiciously chosen for each
node based on UniProtKB relative PTM abundances from
high-quality, manually curated, proteome-wide data [1].
Results: Tremendous enhancement in the high-confidence,
Percolator-validated peptide and protein identifications
compared to a standard protein identification workflow.
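The overview refers to high-confidence, Percolator-validated identifications. As a rough illustration of what a confidence filter at a fixed false discovery rate involves, the sketch below computes q-values from a plain target-decoy count and keeps target matches at or below a threshold. It is not Percolator's algorithm (Percolator rescores matches with a semi-supervised classifier before estimating q-values); the `PSM` class, its fields, and the score scale are assumptions introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (not a Proteome Discoverer type)."""
    spectrum_id: str
    peptide: str
    score: float      # assumed scale: higher means a better match
    is_decoy: bool    # True if the match came from the decoy database


def q_values(psms: List[PSM]) -> List[Tuple[PSM, float]]:
    """Rank PSMs by score and attach a simple target-decoy q-value to each."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    fdrs = []
    targets = decoys = 0
    for p in ranked:
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this rank
        fdrs.append(decoys / max(targets, 1))
    # q-value: the lowest FDR obtainable at this rank or any later rank
    qs = fdrs[:]
    for i in range(len(qs) - 2, -1, -1):
        qs[i] = min(qs[i], qs[i + 1])
    return list(zip(ranked, qs))


def high_confidence(psms: List[PSM], threshold: float = 0.01) -> List[PSM]:
    """Keep target PSMs whose q-value is at or below the threshold."""
    return [p for p, q in q_values(psms) if not p.is_decoy and q <= threshold]
```

Calling `high_confidence(psms, threshold=0.01)` on a list of such records returns only the target matches that pass the 1% cutoff.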
Introduction
Protein identification and characterization by mass
spectrometry has become an established method in biological
research in recent years. The number of protein identifications
from complex biological samples depends on many factors,
ranging from data acquisition strategy to MS/MS data
searching methods. Unfortunately, for any complex biological
sample, only a fraction of the acquired spectra yield confident
peptide matches. Several factors are overlooked by many users
in conventional data searching strategies, including the
appropriate combination of PTMs, coding SNPs [2], isoforms of
proteins, and iterative searching strategies that can
potentially help to identify unmatched spectra. We developed a
comprehensive MS/MS searching workflow in Thermo Scientific™
Proteome Discoverer™ software to maximize high-confidence
peptide/protein identifications. The effect of various search
strategy factors on peptide identifications was explored. We
implemented a process that includes analysis of protein
isoforms, missed cleavage sites, semi-tryptic digestion and,
most importantly, an appropriate combination of PTMs in each
search node. The workflows were tested on plasma and urine
samples analyzed on a Thermo Scientific™ Orbitrap™ hybrid
mass spectrometer. The comprehensive workflow was found to
yield more high-confidence peptide/protein identifications and
to identify multiple PTMs and partially cleaved peptides in a
single run.
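The iterative searching strategy mentioned above, in which spectra left unmatched by one search are passed on to the next, can be sketched roughly as follows. This is a minimal illustration and not the Proteome Discoverer implementation: `iterative_search`, `lookup_node`, the node names, and the string stand-ins for spectra and peptides are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Spectrum = str  # stand-in for a real MS/MS spectrum object (assumption)
# A search node takes spectra and returns confident matches: spectrum -> peptide
SearchNode = Callable[[List[Spectrum]], Dict[Spectrum, str]]


def iterative_search(
    spectra: List[Spectrum],
    nodes: List[Tuple[str, SearchNode]],
) -> Tuple[Dict[Spectrum, Tuple[str, str]], List[Spectrum]]:
    """Run search nodes in order; each node sees only spectra still unmatched."""
    identified: Dict[Spectrum, Tuple[str, str]] = {}
    remaining = list(spectra)
    for name, node in nodes:
        if not remaining:
            break  # nothing left to search
        hits = node(remaining)
        for spectrum, peptide in hits.items():
            identified[spectrum] = (name, peptide)
        # Only unmatched spectra are forwarded to the next node
        remaining = [s for s in remaining if s not in hits]
    return identified, remaining


def lookup_node(table: Dict[Spectrum, str]) -> SearchNode:
    """Toy node that 'identifies' a spectrum if it appears in a lookup table."""
    return lambda spectra: {s: table[s] for s in spectra if s in table}


# Toy data: two nodes with different (hypothetical) search settings
demo_nodes = [
    ("node_1", lookup_node({"spec_1": "PEPTIDE_X"})),
    ("node_2", lookup_node({"spec_2": "PEPTIDE_Y", "spec_1": "PEPTIDE_Z"})),
]
demo_identified, demo_unmatched = iterative_search(
    ["spec_1", "spec_2", "spec_3"], demo_nodes
)
```

In this sketch a spectrum matched by an earlier node is never re-searched, so the order of nodes determines which settings get the first chance at each spectrum.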
Methods
Comprehensive Workflow Development
We developed a comprehensive MS/MS searching workflow in
Proteome Discoverer software using a combination of multiple
search engines (Figure 1) in an iterative fashion to maximize
protein/peptide identifications by considering the most
frequently found PTMs [1], artefacts (Table 1) and partially
cleaved peptides. The combinations of PTMs were judiciously
chosen based on relative abundances (UniProtKB) of each
PTM found experimentally and putatively as described in, from
TABLE 1. Paramete
comprehensive sea
Search Engine   Precursor Mass Tolerance   Fr Tol (Q M O Vel
Mascot          5 ppm                      0. 0
SEQUEST         5 ppm                      0. 0
SEQUEST         5 ppm                      0. 0
SEQUEST         5 ppm                      0. 0
SEQUEST         5 ppm                      0. 0
Sequest HT      5 ppm                      0.
MS Amanda       5 ppm                      0.
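The Methods describe choosing PTM combinations for each search node according to their relative abundance in UniProtKB. One way to picture that selection is to rank PTMs by an abundance count and hand them out to successive nodes in groups. The sketch below assumes this simple ranking scheme; the function name, the group size, and the counts are hypothetical placeholders, not values from UniProtKB or from this workflow.

```python
from typing import Dict, List


def assign_ptms_to_nodes(
    ptm_counts: Dict[str, int],
    n_nodes: int,
    per_node: int,
) -> List[List[str]]:
    """Rank PTMs by count (most abundant first) and split them across nodes.

    Node 0 receives the `per_node` most abundant PTMs, node 1 the next
    `per_node`, and so on. PTMs beyond n_nodes * per_node are not assigned.
    """
    ranked = sorted(ptm_counts, key=ptm_counts.get, reverse=True)
    return [ranked[i * per_node:(i + 1) * per_node] for i in range(n_nodes)]


# Placeholder counts for illustration only (NOT real UniProtKB numbers)
placeholder_counts = {
    "PTM_A": 50,
    "PTM_B": 400,
    "PTM_C": 120,
    "PTM_D": 10,
    "PTM_E": 75,
}
node_ptms = assign_ptms_to_nodes(placeholder_counts, n_nodes=2, per_node=2)
```

Keeping the number of variable modifications per node small limits the growth of the search space in each individual search, which is one plausible reason to spread PTMs across nodes rather than search them all at once.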
Results
We compared the re
strategy with a stand
average, the number
(FDR≤0.01) increase
comprehensive work
whereas the increme
peptide identification
compared to standar
The comprehensive
number of high-
confi
by 90% and the high
respect to the standa
comprehensive work
proteins (with at leas
protein in the group)
The comprehensive
peptides with multipl
particular combinatio
FIGURE 2. Compre