Spectrum Library Retention Time Prediction Based on Endogenous Peptide Standards
Overview
Purpose:
Accurately estimate peptide retention based on spectrum library data
utilizing commonly observed peptides in place of synthetic standards.
Methods:
We consolidate many months' worth of LC-MS/MS data into a library
of MS/MS spectra. Our automated analysis selects endogenous peptides to act
as standards which are used to predict retention times of any peptide in the
library.
Results:
Seventeen peptides were identified as appropriate endogenous
standards. Relative retention time information stored in the library allowed us to
predict the retention times of 1750 peptides more accurately than predictions
based on hydrophobicity.
Introduction
Spectrum libraries are an invaluable starting point for developing targeted
assays (e.g. SRM, PRM) because they provide information about
fragmentation patterns and retention times. When library data are collected
under a variety of LC conditions, the use of synthetic peptide standards can
greatly improve the ability to accurately predict retention time in new
experiments. Unfortunately, any samples not including those peptide
standards cannot be used in the predictions. We present a method for
selecting peptides endogenous to a sample to act as standards and
demonstrate their use for predicting retention times of other peptides, including
those with chemical modifications, indicating portability to both unmodified
and post-translationally modified peptides.
Methods
Sample Preparation
Activity-based protein profiling (ABPP) was performed on various human lung
cancer cells and five pairs of tumor and adjacent control human tissue samples.
Thermo Scientific™ Pierce™ ActivX™ desthiobiotin ATP probes were used to
interact with ATP-utilizing enzymes, and lysines close to the active sites were
labeled with desthiobiotin.
Liquid Chromatography and Mass Spectrometry
Trypsin-digested samples were run on one of three gradients (2 hr on HPLC, 2
hr on UPLC, 4 hr on UPLC). The validation experiment used a 4 hr gradient on
UPLC. Spectra were acquired on a Thermo Scientific™ LTQ Orbitrap™ MS
using data-dependent acquisition.
Data Analysis
Peptide identification was done in Thermo Scientific™ Proteome Discoverer™
(PD) software. The spectrum library was built using the Crystal node for PD
version 1.4. A custom script was written to analyze the library entries and find
appropriate endogenous peptides to use as standards.
Results
Peptide Frequency in the Spectrum Library
Assembly of the Crystal spectrum library collected the retention time information
into one resource. The library contained 220,542 spectra from 250 LC-MS runs
including 9,109 peptide sequences (12,063 total with modified forms). As these
samples did not contain a synthetic peptide standard, we first sought appropriate
endogenous peptides.
The best candidates for peptides to act as retention time landmarks are those
most commonly seen from run to run. We looked at the frequency of peptides in
the 250 runs used to build the library. No peptide was observed in every run;
the most commonly seen peptide had 233 appearances (Figure 1). We
selected the 50 most commonly seen peptides, each of which was seen in no fewer
than 185 runs.
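The frequency ranking described above can be illustrated with a minimal sketch. This is not the custom script used for the library. The function name `most_frequent_peptides`, the `runs` mapping, and the toy peptide sequences are all hypothetical, and the sketch assumes each peptide is counted at most once per run.

```python
from collections import Counter


def most_frequent_peptides(runs, top_n=50):
    """Rank peptides by the number of runs in which they were identified.

    `runs` (hypothetical input format) maps a run identifier to the
    peptide sequences identified in that run. Returns a list of
    (peptide, run_count) pairs for the `top_n` most frequent peptides.
    """
    counts = Counter()
    for peptides in runs.values():
        # A peptide identified several times in one run counts once.
        counts.update(set(peptides))
    # Sort by descending run count, then by sequence so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Toy data for illustration only (not taken from the library).
runs = {
    "run1": ["PEPTIDEA", "PEPTIDEB", "PEPTIDEA"],
    "run2": ["PEPTIDEA", "PEPTIDEC"],
    "run3": ["PEPTIDEA", "PEPTIDEB"],
}
top = most_frequent_peptides(runs, top_n=2)
print(top)
```

With the real library, `top_n=50` would correspond to the candidate set described above, and the smallest run count in the result would be the cutoff (185 runs in this study).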
Endogenous Pept
Starting with the 5
to find a set of pept
consistently eluted
automated the pro
250 runs, for each
before B. Next we
• Start the s
• For each r
appropriat
• If it cannot
peptides i
We found sevente
observed retention
FIGURE 2. A. Ret
observed retentio
landmarks were p
gradients are plot
in all runs, but th
gradient. Peptide
density in the earl
peptides in each r
there are enough
Landmark Peptides Obs
Number of Runs
Use Relative Retent
The Crystal library co
the library as a distan