Mass Spectrometry Peptide ID with MASCOT

Uploaded by

Lovryan Tadena Amiling

Mass Spectrometric Peptide Identification Using MASCOT
Dr. David Wishart
University of Alberta, Edmonton, Canada
[Link]@[Link]
MS Proteomics Applications
• Protein identification/confirmation
• Protein sample purity determination
• Detection of post-translational modifications
• Detection of amino acid substitutions
• Determination of disulfide bonds (# & status)
• De novo peptide sequencing
• Monitoring protein folding (H/D exchange)
• Monitoring protein-ligand complexes/struct.
• 3D Structure determination
Protein Identification
• 2D-GE + MALDI-MS
– Peptide Mass Fingerprinting (PMF)
• 2D-GE + MS-MS
– MS Peptide Sequencing/Fragment Ion Searching
• Multidimensional LC + MS-MS
– ICAT Methods (isotope labelling)
– MudPIT (Multidimensional Protein Ident. Tech.)
• 1D-GE + LC + MS-MS
• De Novo Peptide Sequencing
All require computers to process & analyze data
What is MASCOT?
• A (very) popular web-based tool from
Matrix Science ([Link]) for
performing rapid, accurate, on-line MS
analysis of peptides and proteins
• Supports 3 kinds of analyses
– Peptide Mass Fingerprinting (PMF)
– Sequence (tag) querying
– MS/MS Ion searches
Matrix Science Website

Mascot Home Page

[Link]
Why Mascot?
• Among the first to offer free web-based
services for both PMF and MS/MS
• First to use probability-based scoring
(PBS) or “Expect” values to rank matches
and hits (significant improvement over all
other scoring methods)
• Easy-to-use interface, fast, reliable, up-to-
date databases, accurate – a common
industry standard
Two Mascot Choices
• Matrix Science offers two choices for
users:
• #1) A free, open access web-based
system for occasional (1-10) queries
per day (this is what we’ll use)
• #2) A locally installed version for
heavy use or high throughput MS and
MS/MS labs (100’s of queries/day)
Local Mascot Server
• License cost is ~$4000 per CPU
• Single or dual processor Pentium 4,
Xeon, Athlon, Opteron chips (300 MHz
takes 200s/search, 3 GHz takes 20s)
• 2 Gbytes of RAM (key to performance)
• 120 Gbytes of Hard Disk (IDE) space
to store all desired databases
• Can run on Windows or Linux (same)
Local Mascot
• Allows you to customize your databases and
to customize the frequency of database
uploads
• Mascot Distiller – generates peak lists from
just about any instrument (converts
everything to a Mascot Generic Format file, “MGF”)
• Mascot Daemon – allows you to do batch
searches (“press submit and go home”); also
allows monitoring of data flow on the MS
instrument and autoprocessing of that data
Mascot Databases &
General Disk Needs
Example #1 Peptide Mass
Fingerprinting (PMF)
2D-GE + MALDI (PMF)

Trypsin
+ Gel punch

p53
Trx

G6PDH
Peptide Mass Fingerprinting
• Used to identify protein spots on gels or
protein peaks from an HPLC run
• Depends on the fact that if a protein is cut up
or fragmented in a known way, the resulting
fragments (and resulting masses) are unique
enough to identify the protein
• Requires a database of known sequences
• Uses software to compare observed masses
with masses calculated from database
Principles of Fingerprinting
>Protein 1
Sequence:          acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Mass (M+H):        4842.05
Tryptic fragments: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe

>Protein 2
Sequence:          acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
Mass (M+H):        4842.05
Tryptic fragments: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe

>Protein 3
Sequence:          acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Mass (M+H):        4842.05
Tryptic fragments: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
Principles of Fingerprinting
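>Protein 1
Sequence:      acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Mass (M+H):    4842.05
Mass spectrum: [spectrum image not reproduced]

>Protein 2
Sequence:      acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
Mass (M+H):    4842.05
Mass spectrum: [spectrum image not reproduced]

>Protein 3
Sequence:      acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Mass (M+H):    4842.05
Mass spectrum: [spectrum image not reproduced]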
Protease Cleavage Rules

Trypsin        XXX[KR]--[!P]XXX
Chymotrypsin   XX[FYW]--[!P]XXX
Lys C          XXXXXK--XXXXX
Asp N endo     XXXXX--DXXXXX
CNBr           XXXXXM--XXXXX
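
The trypsin rule above can be sketched as a regular-expression split. This is a minimal illustration rather than Mascot's own digestion code; the function name `trypsin_digest` is invented for this example, and the toy sequence is Protein 1 from the fingerprinting slides.

```python
import re

def trypsin_digest(sequence: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (XXX[KR]--[!P]XXX)."""
    # Zero-width split: lookbehind for K/R, negative lookahead for P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence.upper())
    return [p for p in pieces if p]  # drop the empty piece after a C-terminal K/R

protein_1 = "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"
print(trypsin_digest(protein_1))
# ['ACEDFHSAK', 'DFQEASDFPK', 'IVTMEEEWENDADNFEK', 'QWFE']
```

The other proteases in the table could be handled the same way by swapping the pattern, for example `(?<=K)` for Lys C.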
Why Trypsin?
• Robust, stable enzyme
• Works over a range of pH values & Temp.
• Quite specific and consistent in cleavage
• Cuts frequently to produce “ideal” MW peptides
• Inexpensive, easily available/purified
• Does produce “autolysis” peaks (which can be
used in MS calibrations)
– 1045.56, 1106.03, 1126.03, 1940.94, 2211.10, 2225.12,
2283.18, 2299.18
Calculating Peptide Masses
• Sum the monoisotopic residue masses
• Add mass of H2O (18.01056)
• Add mass of H+ (1.00728) to get M+H
• If Met is oxidized add 15.99491
• If Cys has acrylamide adduct add 71.0371
• If Cys is iodoacetylated add 58.0071
• Other modifications are listed at
– [Link]
• Only consider peptides with masses > 400
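
As a worked example of the recipe above, the sketch below sums monoisotopic residue masses and adds water and a proton. The residue values are the ones from the monoisotopic table in these slides; the proton mass of 1.00728 u and the function name `mh_mass` are assumptions of this sketch.

```python
# Monoisotopic residue masses (values as tabulated in these slides).
MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I": 113.08407, "L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04264, "M": 131.04049,
    "H": 137.05891, "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056
PROTON = 1.00728      # assumed mass of H+
MET_OX = 15.99491     # added per oxidized methionine

def mh_mass(peptide: str, oxidized_met: int = 0) -> float:
    """Return the singly protonated monoisotopic mass (M+H) of a peptide."""
    residues = sum(MONO[aa] for aa in peptide.upper())
    return residues + WATER + PROTON + oxidized_met * MET_OX

print(round(mh_mass("ACEDFHSAK"), 3))  # close to the 1007.4251 used in the slides
print(round(mh_mass("QWFE"), 3))       # close to 609.2667
```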
Post-Translational
Modifications (PTM)
Masses in MS
• Monoisotopic mass is the mass determined using the masses of the most abundant isotopes
• Average mass is the abundance weighted mass of all isotopic components
Amino Acid Residue Masses
Monoisotopic Mass
Glycine 57.02147 Aspartic acid 115.02695
Alanine 71.03712 Glutamine 128.05858
Serine 87.03203 Lysine 128.09497
Proline 97.05277 Glutamic acid 129.04264
Valine 99.06842 Methionine 131.04049
Threonine 101.04768 Histidine 137.05891
Cysteine 103.00919 Phenylalanine 147.06842
Isoleucine 113.08407 Arginine 156.10112
Leucine 113.08407 Tyrosine 163.06333
Asparagine 114.04293 Tryptophan 186.07932
Amino Acid Residue Masses
Average Mass
Glycine 57.0520 Aspartic acid 115.0886
Alanine 71.0788 Glutamine 128.1308
Serine 87.0782 Lysine 128.1742
Proline 97.1167 Glutamic acid 129.1155
Valine 99.1326 Methionine 131.1986
Threonine 101.1051 Histidine 137.1412
Cysteine 103.1448 Phenylalanine 147.1766
Isoleucine 113.1595 Arginine 156.1876
Leucine 113.1595 Tyrosine 163.1760
Asparagine 114.1039 Tryptophan 186.2133
Preparing a Peptide Mass
Fingerprint Database
• Take a protein sequence database (Swiss-
Prot or nr-GenBank)
• Determine cleavage sites and identify
resulting peptides for each protein entry
• Calculate the mass (M+H) for each peptide
• Sort the masses from lowest to highest
• Have a pointer for each calculated mass to
each protein accession number in databank
Building A PMF Database
Sequence DB and Calc. Tryptic Frags

>P12345
Sequence:          acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Tryptic fragments: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe

>P21234
Sequence:          acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
Tryptic fragments: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe

>P89212
Sequence:          acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Tryptic fragments: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe

Mass List (sorted)
450.2017  (P21234)
609.2667  (P12345)
664.3300  (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
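
The database-building steps can be sketched in a few lines: digest each entry, compute M+H for every peptide, and keep a sorted list of (mass, accession, peptide) tuples. This is only an illustration under stated assumptions: the helper names are invented, the proton mass is taken as 1.00728 u, and computed masses may differ from the slide's list in the last decimal places.

```python
import re

MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I": 113.08407, "L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04264, "M": 131.04049,
    "H": 137.05891, "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}

def digest(seq: str) -> list[str]:
    # Trypsin: cut after K/R unless followed by P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq.upper()) if p]

def mh(peptide: str) -> float:
    # Residue sum + water + proton (1.00728 u assumed).
    return sum(MONO[aa] for aa in peptide) + 18.01056 + 1.00728

def build_pmf_index(proteins: dict[str, str], min_mass: float = 400.0):
    """Return (mass, accession, peptide) tuples sorted by mass."""
    index = []
    for accession, seq in proteins.items():
        for pep in digest(seq):
            mass = mh(pep)
            if mass > min_mass:  # "only consider peptides with masses > 400"
                index.append((round(mass, 4), accession, pep))
    return sorted(index)

proteins = {
    "P12345": "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe",
    "P21234": "acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe",
    "P89212": "acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe",
}
for mass, accession, pep in build_pmf_index(proteins):
    print(f"{mass:10.4f}  {accession}  {pep}")
```

Keeping the accession next to each mass plays the role of the "pointer" from the slide; a real database would use a more compact structure.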
The Fingerprint (PMF) Algorithm
• Take a mass spectrum of a trypsin-
cleaved protein (from gel or HPLC peak)
• Identify as many masses as possible in
spectrum (avoid autolysis peaks)
• Compare query masses with database
masses and calculate # of matches or
matching score (based on length and
mass difference)
• Rank hits and return top scoring entry –
this is the protein of interest
Query (MALDI) Spectrum
[Figure: MALDI spectrum, m/z axis from 500 to 2500, with labelled peaks at 450, 609, 698, 1007, 1199, 1940 (trp), 2098 and 2211 (trp); trp = trypsin autolysis peak]

Query vs. Database
Query Masses   Database Mass List
450.2201       450.2017  (P21234)
609.3667       609.2667  (P12345)
698.3100       664.3300  (P89212)
1007.5391      1007.4251 (P12345)
1199.4916      1114.4416 (P89212)
2098.9909      1183.5266 (P12345)
               1300.5116 (P21234)
               1407.6462 (P21234)
               1526.6211 (P89212)
               1593.7101 (P89212)
               1740.7501 (P21234)
               2098.8909 (P12345)

Results
2 Unknown masses
1 hit on P21234
3 hits on P12345
Conclude the query protein is P12345
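
The comparison on this slide can be reproduced with a simple hit counter. This is a sketch of hit counting only, not Mascot's scoring; the absolute tolerance of 0.2 Da is an assumption chosen so that the toy query masses match as the slide reports, and `count_hits` is a made-up name.

```python
from collections import Counter

DATABASE = [
    (450.2017, "P21234"), (609.2667, "P12345"), (664.3300, "P89212"),
    (1007.4251, "P12345"), (1114.4416, "P89212"), (1183.5266, "P12345"),
    (1300.5116, "P21234"), (1407.6462, "P21234"), (1526.6211, "P89212"),
    (1593.7101, "P89212"), (1740.7501, "P21234"), (2098.8909, "P12345"),
]
QUERY = [450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909]

def count_hits(query, database, tol_da=0.2):
    """Count, per accession, how many query masses fall within tol_da of an entry."""
    hits = Counter()
    unmatched = []
    for q in query:
        accessions = {acc for mass, acc in database if abs(mass - q) <= tol_da}
        if accessions:
            hits.update(accessions)
        else:
            unmatched.append(q)
    return hits, unmatched

hits, unmatched = count_hits(QUERY, DATABASE)
print(hits.most_common())  # [('P12345', 3), ('P21234', 1)]
print(unmatched)           # [698.31, 1199.4916]
```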
What You Need To Do PMF
• A list of query masses (as many as possible)
• Protease(s) used or cleavage reagents
• Databases to search (SWProt, NR, Organism)
• Estimated mass and pI of protein spot (opt)
• Cysteine (or other) modifications
• Minimum number of hits for significance
• Mass tolerance (100 ppm = 1000.0 ± 0.1 Da)
• A PMF website (Prowl, ProFound, Mascot, etc.)
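
The ppm tolerance in the list above converts to an absolute window as follows. The function names are illustrative only.

```python
def ppm_window(mass: float, ppm: float) -> tuple[float, float]:
    """Return the (low, high) mass window for a tolerance given in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta

def within_ppm(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= theoretical * ppm / 1e6

low, high = ppm_window(1000.0, 100)   # 100 ppm at 1000 Da is +/- 0.1 Da
print(round(low, 4), round(high, 4))  # 999.9 1000.1
```

Because the window scales with mass, the same 100 ppm setting allows about ±0.2 Da at 2000 Da.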
PMF on the Web
• Mascot
• [Link]
• ProFound
– [Link]
• MOWSE
• [Link]
• PeptideSearch
• [Link]
[Link]/GroupPages/[Link]
• PeptIdent
• [Link]
Mascot – PMF Query


[Link]
Exercise #1
• Analysis of a yeast protein (75 kDa)
treated with iodoacetamide,
trypsinized and subjected to MALDI-TOF
• Go to “Worked Example 1” in your
notes to follow instructions
• Access your PMF data at:
[Link]

listed as [Link]
What Are Missed Cleavages?
Sequence
>Protein 1
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe

Tryptic Fragments (no missed cleavage)
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)

Tryptic Fragments (1 missed cleavage)
acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
acedfhsakdfqeasdfpk (2171.9338)
ivtmeeewendadnfekqwfe (2689.1398)
dfqeasdfpkivtmeeewendadnfek (3263.3997)
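
Missed cleavages can be generated by joining adjacent fully cleaved fragments. The sketch below is one simple way to do it; the function name and the `missed` parameter are assumptions, not Mascot options.

```python
import re

def digest(sequence: str, missed: int = 0) -> list[str]:
    """Tryptic peptides with up to `missed` missed cleavage sites."""
    parts = [p for p in re.split(r"(?<=[KR])(?!P)", sequence.upper()) if p]
    peptides = []
    for n in range(missed + 1):             # n = number of missed sites
        for i in range(len(parts) - n):
            peptides.append("".join(parts[i:i + n + 1]))
    return peptides

protein_1 = "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"
for pep in digest(protein_1, missed=1):
    print(pep)
# 4 fully cleaved peptides followed by 3 peptides with one missed cleavage
```

Each allowed missed cleavage enlarges the list of candidate masses, which is why search tools ask for a maximum.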
Mascot Databases
MASCOT Scoring
Why Probability-Based
Scoring?
• Will explain PBS later…
• Offers a simple numerical (and graphical)
assessment of whether a result is significant
• More reliable/accurate than simple mass or
peptide cut-off techniques
• Allows both MS/MS and PMF data to be scored the
same way
• Scores from different searches or different
databases can be easily & directly compared
Mascot Scoring
• The statistics of peptide fragment
matching in MS (or PMF) is very similar to
the statistics used in BLAST
• The scoring probability appears to follow
an extreme value distribution
• High scoring segment pairs (in BLAST)
are analogous to high scoring mass
matches in Mascot
• Mascot scoring system is based on the
MOWSE scoring system
MOWSE
• MOlecular Weight SEarch
• Scoring system based on peptide
frequency distribution from the OWL
non redundant protein Database

Pappin DJC, Hojrup P, and Bleasby AJ (1993) Rapid identification of proteins by peptide-mass fingerprinting. Curr. Biol. 3:327-332
MOWSE
>Protein 1
Sequence:          acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Mass (M+H):        4842.05
Tryptic fragments: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe

>Protein 2
Sequence:          acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
Mass (M+H):        4842.05
Tryptic fragments: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe

>Protein 3
Sequence:          MASMGTLAFD EYGRPFLIIK DQDRKSRLMG LEALKSHIMA AKAVANTMRT
                   SLGPNGLDKM MVDKDGDVTV TNDGATILSM MDVDHQIAKL MVELSKSQDD
                   EIGDGTTGVV VLAGALLEEA EQLLDRGIHP IRIAD
Mass (M+H):        14563.36
Tryptic fragments: SQDDEIGDGTTGVVVLAGALLEEAEQLLDR
                   DGDVTVTNDGATILSMMDVDHQIAK
                   MASMGTLAFDEYGRPFLIIK
                   TSLGPNGLDK
                   LMGLEALK
                   LMVELSK
                   AVANTMR
                   SHIMAAK
                   GIHPIR
                   MMVDK
                   DQDR
MOWSE
1. Group Proteins into 10 kDa ‘bins’.

0-10 kDa
>Protein 1 (4954.13)
acedfhsakdfqeasdfpkivtmeeewendadnfekqwfel
>Protein 2 (5672.48)
acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfekqwfei

10-20 kDa
>Protein 3 (14563.36)
MASMGTLAFD EYGRPFLIIK DQDRKSRLMG LEALKSHIMA AKAVANTMRT
SLGPNGLDKM MVDKDGDVTV TNDGATILSM MDVDHQIAKL MVELSKSQDD
EIGDGTTGVV VLAGALLEEA EQLLDRGIHP IRIAD
MOWSE
2. For each protein, place fragments into 100 Da bins.

Protein     Mol. Wt.    Fragment
Protein 1   2098.8909   IVTMEEEWENDADNFEK
            1183.5266   DFQEASDFPK
            1007.4251   ACEDFHSAK
            722.3508    QWFEL
Protein 2   1740.7500   DFHSADFQEASDFPK
            1407.6460   IVTMEEEWENK
            1456.6127   DADNFEQWFEK
            722.3508    QWFEI

Bin         Fragment(s)
2000-2100   IVTMEEEWENDADNFEK
1900-2000
1800-1900
1700-1800   DFHSADFQEASDFPK
1600-1700
1500-1600
1400-1500   IVTMEEEWENK, DADNFEQWFEK
1300-1400
1200-1300
1100-1200   DFQEASDFPK
1000-1100   ACEDFHSAK
900-1000
800-900
700-800     QWFEL, QWFEI
600-700
500-600
400-500
MOWSE
The MOWSE frequency distribution plot looks like this:
MOWSE
3. Divide the number of fragments for each bin by the total
number of fragments for each 10 kDa protein interval

Bin         Fragment(s)                 Total   Frequency
2000-2100   IVTMEEEWENDADNFEK           1       0.125
1900-2000                               0       0.000
1800-1900                               0       0.000
1700-1800   DFHSADFQEASDFPK             1       0.125
1600-1700                               0       0.000
1500-1600                               0       0.000
1400-1500   IVTMEEEWENK, DADNFEQWFEK    2       0.250
1300-1400                               0       0.000
1200-1300                               0       0.000
1100-1200   DFQEASDFPK                  1       0.125
1000-1100   ACEDFHSAK                   1       0.125
900-1000                                0       0.000
800-900                                 0       0.000
700-800     QWFEL, QWFEI                2       0.250
600-700                                 0       0.000
500-600                                 0       0.000
400-500                                 0       0.000
MOWSE
4. For each 10 kDa interval, normalize to the largest
bin value

Bin         Fragment(s)                 Total   Frequency   Normalized
2000-2100   IVTMEEEWENDADNFEK           1       0.125       0.5
1900-2000                               0       0.000       0
1800-1900                               0       0.000       0
1700-1800   DFHSADFQEASDFPK             1       0.125       0.5
1600-1700                               0       0.000       0
1500-1600                               0       0.000       0
1400-1500   IVTMEEEWENK, DADNFEQWFEK    2       0.250       1
1300-1400                               0       0.000       0
1200-1300                               0       0.000       0
1100-1200   DFQEASDFPK                  1       0.125       0.5
1000-1100   ACEDFHSAK                   1       0.125       0.5
900-1000                                0       0.000       0
800-900                                 0       0.000       0
700-800     QWFEL, QWFEI                2       0.250       1
600-700                                 0       0.000       0
500-600                                 0       0.000       0
400-500                                 0       0.000       0
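
Steps 2 to 4 amount to a histogram followed by two normalizations. The sketch below uses the eight fragment masses listed for Protein 1 and Protein 2 in step 2; bins are assigned by integer division, so 722.3508 lands in the 700-800 bin. The function name is hypothetical and this is not the published MOWSE implementation.

```python
from collections import Counter

# Fragment masses for the 0-10 kDa protein interval (from step 2).
FRAGMENT_MASSES = [
    2098.8909, 1183.5266, 1007.4251, 722.3508,   # Protein 1
    1740.7500, 1407.6460, 1456.6127, 722.3508,   # Protein 2
]

def normalized_frequencies(masses, bin_width=100):
    """Map each bin's lower edge to its frequency, normalized to the largest bin."""
    counts = Counter(int(m // bin_width) * bin_width for m in masses)
    total = sum(counts.values())
    freqs = {b: c / total for b, c in counts.items()}       # step 3
    top = max(freqs.values())
    return {b: f / top for b, f in freqs.items()}           # step 4

for lower, value in sorted(normalized_frequencies(FRAGMENT_MASSES).items(), reverse=True):
    print(f"{lower}-{lower + 100}: {value}")
```

Empty bins are simply absent from the returned dictionary; a lookup would treat them as 0.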
MOWSE
5. Compare spectrum masses against fragment mass
list for each protein in the database. Retrieve the
frequency score for each match and multiply.

Spectrum mass   Bin         Fragment(s)                 Normalized
1740.7500       1700-1800   DFHSADFQEASDFPK             0.5
1456.6127       1400-1500   IVTMEEEWENK, DADNFEQWFEK    1
722.3508        700-800     QWFEL, QWFEI                1

(Bin values are those of the normalized frequency table in step 4.)

0.5 x 1 x 1 = 0.5
MOWSE
6. Invert and multiply, and normalize to an 'average'
protein of 50 kDa (50 000 Da):

PN = product of distribution frequency scores = 0.5 x 1 x 1 = 0.5
H = 'Hit' protein MW = 5672.48

\[
\text{Score} = \frac{50\,000}{P_N \times H} = \frac{50\,000}{0.5 \times 5672.48} \approx 17.63
\]
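
The arithmetic of steps 5 and 6 can be checked with a short function. The name `mowse_score` and its parameters are made up for this sketch; the inputs are the slide's matched frequencies and hit protein MW.

```python
from math import prod

def mowse_score(matched_frequencies, hit_mw, reference_mw=50_000):
    """Score = reference_mw / (product of matched frequencies * hit protein MW)."""
    p_n = prod(matched_frequencies)
    return reference_mw / (p_n * hit_mw)

score = mowse_score([0.5, 1, 1], hit_mw=5672.48)
print(round(score, 2))  # 17.63
```

Rare fragment masses have small frequencies, so matching them shrinks the product and raises the score, while a large hit protein lowers it.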
MOWSE
• Takes into account relative abundance of peptides in the database when calculating scores
• Protein size is compensated for
• The model consists of numerous bins 100 Da wide (roughly the average aa mass)
• Does not provide a measure of confidence for the prediction
MASCOT
• Probability-based MOWSE scoring
• The probability that the observed
match between experimental data and a
protein sequence is a random event is
approximately calculated for each
protein in the sequence database.
• Probability model details not published
Perkins DN, Pappin DJC, Creasy DM, and Cottrell JS (1999) Probability-based
protein identification by searching sequence databases using mass spectrometry
data. Electrophoresis 20:3551-3567.
Mascot/Mowse Scoring
• The Mascot Score is given as S = -10 x log10(P),
where P is the probability that the observed
match is a random event
• Try to aim for probabilities where P<0.05 (less
than a 5% chance the peptide mass match is
random)
• With today’s databases, Mascot scores
greater than 76 are significant (p<0.05)
• We show in the Mascot Lab that a score's
statistical significance is a complex function
of database size, mass window tolerance, etc.
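
The score definition is a base-10 log transform, which the sketch below makes concrete. It only converts between S and P; it does not reproduce how Mascot derives P or how the significance threshold depends on database size. Function names are illustrative.

```python
import math

def mascot_score(p: float) -> float:
    """S = -10 * log10(P), with P the probability that the match is random."""
    return -10 * math.log10(p)

def probability(score: float) -> float:
    """Invert the score: P = 10 ** (-S / 10)."""
    return 10 ** (-score / 10)

print(round(mascot_score(0.05), 2))  # 13.01
print(probability(120))              # about 1e-12
```

Every 10 score units correspond to a tenfold drop in P.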
Mascot Scoring
– The Mascot Score is given as S = -10 x log10(P), where P is
the probability that the observed match is a random event
– The significance of that result depends on the size of the
database being searched. Mascot shades in green the
insignificant hits using a P=0.05 cutoff

In this example, scores less than 74 are insignificant

A Mascot score of 120 corresponds to P = 1 x 10^-12
Advantages of PMF
• Uses a “robust” & inexpensive form of MS
(MALDI)
• Doesn’t require too much sample optimization
• Can be done by a moderately skilled operator
(don’t need to be an MS expert)
• Widely supported by web servers
• Improves as DB’s get larger & instrumentation
gets better
• Very amenable to high throughput robotics
(up to 500 samples a day)
Limitations With PMF
• Requires that the protein of interest
already be in a sequence database
• Spurious or missing critical mass peaks
always lead to problems
• Mass resolution/accuracy is critical, best
to have <20 ppm mass accuracy
• Generally found to only be about 40%
effective in positively identifying gel spots
Example #2 MS/MS
Identification of a Protein
from a Peptide Mixture
MS-MS for Protein ID
• Proteins are isolated (from gel or HPLC)
and subjected to tryptic digestion
• Peptides are sent through ionizer and into
a collision cell where the doubly charged
ions are selected and fragmented through
collision-induced dissociation (CID)
• The resulting singly charged ions
(daughter ions) are analyzed to determine
the sequence or to ID the parent peptide
Why Trypsin for MS-MS?
• CID of peptides less than 2-3 kD is most
reliable for MS-MS studies – The
frequency of tryptic cleavage guarantees
that most peptides will be of this size
• Trypsin cleaves on the C-terminal side of
arginine and lysine. By putting the basic
residues at the C-terminus, peptides
fragment in a more predictable manner
throughout the length of the peptide
Why Double Charges?
• Easiest spectra to interpret are those
obtained from doubly-charged peptide
precursors, where the resulting fragment
ions are mostly singly-charged
• Doubly-charged precursors also fragment
such that most of the peptide bonds break
with comparable frequency, such that one
is more likely to derive a complete
sequence
MS-MS & Peptide Fragments
• When peptides or proteins are admitted
to a collision cell, the peptide usually
fragments at the weakest bond (the
peptide bond, but some CH-NH and CH-CO
breakage also occurs)
• Collision conditions have to be optimized
for each peptide
• Two main types of daughter ions are
produced -- “b” ions and “y” ions
MS-MS Peptide Fragmentation
        yn-1     yn-2            y1
    R1       R2       R3              Rn
H2N-CH-CO-NH-CH-CO-NH-CH-CO ... CO-NH-CH-CO2H
         b1       b2            bn-1

[Figure: schematic MS-MS spectrum with alternating b1, y1, b2, y2, b3, y3, b4, y4, b5, y5 peaks; vertical axis: signal]
MS-MS Peptide Fragmentation
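Ala-Gly-His-Leu-….Phe-Glu-Cys-Tyr

[Figure: schematic MS-MS spectrum for this peptide with alternating b1, y1, b2, y2, b3, y3, b4, y4, b5, y5 peaks; vertical axis: signal]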
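
The b and y ion series in the diagrams can be computed directly from residue masses. The sketch below assumes singly charged fragments and a proton mass of 1.00728 u; `b_y_ions` is a made-up name, and neutral losses and other ion types are ignored.

```python
MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I": 113.08407, "L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04264, "M": 131.04049,
    "H": 137.05891, "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056
PROTON = 1.00728  # assumed mass of H+

def b_y_ions(peptide: str):
    """Return (b, y) lists of singly charged fragment m/z values, b1..bn-1 and y1..yn-1."""
    masses = [MONO[aa] for aa in peptide.upper()]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]           # N-terminal fragments
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]  # C-terminal fragments
    return b, y

b_ions, y_ions = b_y_ions("ACEK")
print([round(m, 3) for m in b_ions])
print([round(m, 3) for m in y_ions])  # y1 for a C-terminal lysine is about 147.113
```

For any cleavage position, the complementary b and y masses add up to the peptide's M+H plus one extra proton, which is a handy consistency check when assigning peaks.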
Different MS-MS Instruments
Yield Different Spectra
• A typical QTOF or triple quad MS-MS
spectrum of a tryptic peptide contains a
continuous series of y-type ions. The b-type
ions are usually seen only at lower masses
below the precursor m/z value
• Ion trap CID data of tryptic peptides is
different in that one often finds a
continuous series of both b-type and y-type
ions throughout the spectrum
MS/MS – The Movie
(Kathleen Binns)
• [Link]
[Link]
Protein ID by MS-MS
• Peptide fragments from target protein are
sequenced by MS-MS using a variety of
algorithms (SEQUEST, Mascot) or via
manual methods
• The peptide fragment sequences are sent
to BLAST to be queried against a protein
sequence database
• The protein having the highest number of
sequence matches is ID’d as the target
MS-MS & Proteomics
Advantages
• Provides precise sequence-specific data
• More informative than PMF methods (>90%)
• Can be used for de novo sequencing (not entirely dependent on databases)
• Can be used to ID post-trans. modifications
Disadvantages
• Requires more handling, refinement and sample manipulation
• Requires more expensive and complicated equipment
• Requires high level expertise
• Slower, not generally high throughput
Mascot – MS/MS Query


[Link]
Exercise #2
• Analysis of a human nuclear protein
(65 kDa) treated with iodoacetamide
and trypsinized followed by MS/MS
• Go to “Worked Example 2” in your
notes to follow instructions
• Access your MS/MS data at:
[Link]

listed as [Link]
Mascot and MS/MS formats
• For MS/MS work, the data file must
contain 1 or more sets of MS/MS data
• Supported sets include:
– Finnigan (.ASC)
– Micromass (.PKL)
– Sequest (.DTA)
– PerSeptive (.PKS)
– Sciex API III
– Mascot Generic Format (.MGF)
Mascot Generic Format (MGF)
COM=10 pmol digest of Sample X15
ITOL=1
ITOLU=Da
MODS=Met Ox,Cys B propionamide
MASS=Monoisotopic
USERNAME=Lou Scene
USEREMAIL=leu@[Link]
CHARGE=2+ and 3+
BEGIN IONS
TITLE=Peak 1
PEPMASS=983.6
846.60 73
846.80 44
847.60 67

(Slide annotations: PEPMASS is the parent ion mass, here 2+; each following line gives a daughter ion mass and its intensity.)
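
To show how the pieces of an MGF file fit together, here is a minimal reader for the layout on this slide. It is a sketch only: `parse_mgf` is an invented helper, it handles just `KEY=value` lines, `BEGIN IONS`/`END IONS` blocks and "mass intensity" pairs, and the closing `END IONS` line in the sample text is added here because the slide excerpt stops before it.

```python
def parse_mgf(text: str):
    """Return (header_params, spectra) from MGF-style text."""
    header, spectra, current = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            spectra.append(current)
            current = None
        elif "=" in line:
            key, value = line.split("=", 1)
            # Parameters before the first BEGIN IONS apply to the whole search.
            (header if current is None else current["params"])[key] = value
        elif current is not None:
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
    if current is not None:          # tolerate a truncated final block
        spectra.append(current)
    return header, spectra

SAMPLE = """COM=10 pmol digest of Sample X15
ITOL=1
ITOLU=Da
MASS=Monoisotopic
CHARGE=2+ and 3+
BEGIN IONS
TITLE=Peak 1
PEPMASS=983.6
846.60 73
846.80 44
847.60 67
END IONS
"""
header, spectra = parse_mgf(SAMPLE)
print(header["ITOLU"], spectra[0]["params"]["PEPMASS"], spectra[0]["peaks"])
```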
Example #3 A “Hard”
MS/MS Problem
Exercise #3
• Analysis of a novel neuropeptide
hormone induced by music/sound
• No known or suspected PTMs
• Ion trap MS-MS spectrum – What is
it? What’s the sequence?
• Access your MS/MS data at:
[Link]

listed as [Link]
MS/MS Spectrum of
Neurosensin
What Do You Find?
Protocols for MS-MS
Sequencing
• Usually can’t tell a “b” ion from a “y” ion
• Assume the lowest mass visible in the
spectrum is the y1 ion, a lysine or arginine,
because trypsin cuts after a lysine or arginine
• This y1 mass should be 147.113 for lysine
or 175.119 for arginine (the y1 ion is
calculated by adding 19.018 u, i.e. three
hydrogens and one oxygen, to the residue
mass of lysine or arginine)
MS-MS Sequencing
• Using the mass tables, look to the right of y1
and see if you can find another prominent
peak that is equal to y1 + AA where AA is the
residue mass for any of the 20 amino acids.
This is the y2 ion
• Proceed in a rightward direction, identifying
other yn ions that differ by an AA residue
mass (don’t expect to find all)
• The yn series produces a “reverse” sequence
• Watch for possible dipeptide peaks that may
fool you
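
The ladder-walking procedure above can be sketched as follows: identify y1, then name each step between consecutive y ions by the residue mass it matches. This assumes a clean, already-picked list of y-ion peaks, which real spectra rarely provide; the function name, the 0.02 Da tolerance and the example peaks (computed for the toy peptide ALEK) are all assumptions.

```python
MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I": 113.08407, "L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04264, "M": 131.04049,
    "H": 137.05891, "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
Y1 = {147.113: "K", 175.119: "R"}  # y1 masses quoted in the slides

def read_y_ladder(peaks, tol=0.02):
    """Name the residues implied by a y-ion ladder, returned in N-to-C order."""
    peaks = sorted(peaks)
    first = next((aa for m, aa in Y1.items() if abs(peaks[0] - m) <= tol), "?")
    residues = [first]
    for low, high in zip(peaks, peaks[1:]):
        step = high - low
        names = [aa for aa, m in MONO.items() if abs(m - step) <= tol]
        residues.append("/".join(names) if names else "?")
    return residues[::-1]  # the y series reads the sequence in reverse

print(read_y_ladder([147.113, 276.155, 389.240]))  # ['I/L', 'E', 'K']
```

Steps that match no single residue are reported as "?"; these are the places to check for a missing peak or a dipeptide-sized gap.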
Things To Remember
• Gly + Gly = 114.043 u and Asn = 114.043 u
• Ala + Gly = 128.059 u and Gln = 128.059 u
and Lys = 128.095 u
• Gly + Val = 156.090 u and Arg = 156.101 u
• Ala + Asp = Glu + Gly = 186.064 and Trp =
186.079 u
• Ser + Val = 186.100 u and Trp = 186.079 u
• Leu = Ile = 113.084u
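
The near-coincidences listed above can be found by brute force over all residue pairs. The sketch below uses the monoisotopic table from these slides; the 0.05 Da tolerance and the function name are arbitrary choices for illustration.

```python
from itertools import combinations_with_replacement

MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277, "V": 99.06842,
    "T": 101.04768, "C": 103.00919, "I": 113.08407, "L": 113.08407, "N": 114.04293,
    "D": 115.02695, "Q": 128.05858, "K": 128.09497, "E": 129.04264, "M": 131.04049,
    "H": 137.05891, "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}

def confusable_pairs(tol=0.05):
    """List (pair, single residue, mass difference) where a dipeptide mimics one residue."""
    found = []
    for a, b in combinations_with_replacement(MONO, 2):
        pair_mass = MONO[a] + MONO[b]
        for single, mass in MONO.items():
            if abs(pair_mass - mass) <= tol:
                found.append((a + b, single, round(pair_mass - mass, 3)))
    return found

for pair, single, diff in confusable_pairs():
    print(f"{pair} vs {single}: {diff:+.3f} u")
```

Tightening `tol` to the instrument's actual mass accuracy shows which of these pairs can be told apart in practice.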
MS-MS Sequencing
• Use the remaining “unassigned” peaks to
see if you can construct a “b” ion series
• The highest mass peak corresponds to the
parent ion or parent minus 147 (K) or 175 (R)
• The “b” ions give the “normal” sequence
• Both forward (b ion) and backward (y ion)
sequences should be consistent
• Use the resulting sequence tag to search the
databases using BLAST (remember to use a
high Expect value ~ 100) to see if the
sequence matches something
Conclusions
• Mascot is an excellent FREE resource
for doing PMF and MS/MS searches of
proteins
• Understanding the scoring scheme
and importance of database size (and
mass tolerance) is critical to using
Mascot optimally
• Not everything can be done on Mascot
