
Proteomics and mass spectrometry
Manimalha Balasubramani
Outline

Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
A mass spectrum
[Figure: mass spectrum. x-axis: Mass (m/z), 799.0 to 2700.0; y-axis: % Intensity, 0 to 100; maximum intensity 6.3E+4]
Basically measures mass

Adapted from google


Components

Adapted from an Analytical chemistry textbook


Ionization process

MALDI: Matrix-Assisted Laser Desorption/Ionization

ESI: Electrospray Ionization

Nobel prize in Chemistry, 2002


MALDI: Matrix-Assisted Laser Desorption/Ionization
ESI: Electrospray Ionization
Mass analyzers: several designs

Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207


GPCL inventory
ABI Voyager DE PRO, walk-up use
ABI 4700 Proteomics Analyzer
Thermoelectron LCQ Deca with Surveyor HPLC
ABI Qstar Elite with Ultimate 3000 HPLC
Bruker micrOTOF with Ultimate 3000 HPLC
Bruker 12 Tesla FTMS with Ultimate 3000 HPLC
Time-of-flight (TOF) analyzers

MALDI TOF: Voyager DE PRO
ESI TOF: Ultimate 3000 with micrOTOF
MALDI TOF - principle

$$KE = zeV = \tfrac{1}{2} m v^{2}$$
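The energy balance above implies that heavier ions drift more slowly, so arrival time encodes m/z. A minimal sketch of that relation follows; the drift-tube length and accelerating voltage are illustrative assumptions, not instrument settings from this deck, and the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge e, in coulombs
DALTON = 1.66053906660e-27   # mass of 1 Da, in kilograms

def flight_time(mass_da, charge, accel_volts, tube_m):
    """Drift time in seconds for an ion, using zeV = (1/2) m v^2.

    accel_volts and tube_m are assumed, idealised values: no delayed
    extraction, no reflector, no initial velocity spread.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2.0 * charge * E_CHARGE * accel_volts / mass_kg)
    return tube_m / velocity

# Example: singly charged 1000 Da and 4000 Da ions, 20 kV, 1 m tube (assumed)
t_light = flight_time(1000.0, 1, 20_000.0, 1.0)
t_heavy = flight_time(4000.0, 1, 20_000.0, 1.0)
```

Because v scales as the square root of z/m, a fourfold increase in mass at the same charge doubles the flight time.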
MS of serum albumin

ESI TOF

MALDI TOF
Tandem mass spectrometer

MALDI TOF/TOF

MS and MS/MS
Ion Trap

MS, MS^2, MS^3, ... MS^n


Quadrupole-q-TOF

ESI QqTOF
installation phase.

FT MS
Bottom line: resolution and mass accuracy


FWHM
Full width at half maximum of a peak
Resolution and mass accuracy
Δm, the peak width measured at 50% of the peak height, is the Full Width at Half Maximum (FWHM).

$$R = \frac{M}{\Delta m}$$

R = resolution
M = mass of the peak of interest
Δm = width in daltons of the peak

Mass accuracy is measured as a parts-per-million (ppm) value:

$$\text{ppm} = \frac{10^{6}\,\Delta m}{M} = \frac{10^{6}}{R}$$
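As a quick numerical check of the two relations on this slide, here is a small sketch. The function names and the example peak (1000 Da, 0.1 Da wide at half height) are hypothetical.

```python
def resolution(mass, fwhm):
    """R = M / (peak width at half maximum), both in daltons."""
    return mass / fwhm

def width_ppm(mass, fwhm):
    """Peak width expressed in parts per million: 1e6 * width / M."""
    return 1e6 * fwhm / mass

# Hypothetical peak: M = 1000 Da, FWHM = 0.1 Da
R = resolution(1000.0, 0.1)      # about 10,000
ppm = width_ppm(1000.0, 0.1)     # about 100 ppm, which equals 1e6 / R
```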
outline

Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
Peptide Mass Fingerprinting - PMF

Database entry
NCBI
From: http://gobi.ym.edu.tw/course/mass/2004-0325.pdf
Informatics
Search engines
Mascot, Matrix Science
Sequest, Thermoelectron

Freeware
Protein Prospector (http://prospector.ucsf.edu/)
TPP tools (http://tools.proteomecenter.org/TPP.php)
Database searching using MASCOT

Overview of the experiment


Submission of data to MASCOT webserver
1D SDS PAGE of proteins

Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207


Mass spectrum
[Figure: 4700 Reflector Spec #1 MC=>TR [BP = 1479.9, 15779]. x-axis: Mass to charge ratio (m/z), 699.0 to 3000.0; y-axis: % Intensity, 0 to 100; maximum intensity 1.6E+4]
Peak list
Compiled from the mass spectra
Mass list
Mass list and intensity
Submitted to the search engine
http://www.matrixscience.com/
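A minimal sketch of compiling such a peak list is given below. It assumes the spectrum has already been reduced to (m/z, intensity) pairs; the 5% relative-intensity cutoff, the function names, and two of the three example intensities are hypothetical choices, not settings from this deck.

```python
def make_peak_list(peaks, min_rel_intensity=5.0):
    """Keep peaks at or above a percentage of the base peak, sorted by m/z.

    peaks: list of (mz, intensity) tuples. The cutoff is an assumed,
    illustrative noise filter.
    """
    base = max(intensity for _, intensity in peaks)
    kept = [(mz, inten) for mz, inten in peaks
            if 100.0 * inten / base >= min_rel_intensity]
    return sorted(kept)

def format_peak_list(peaks):
    """One 'mass intensity' pair per line, a common text form for search input."""
    return "\n".join(f"{mz:.4f} {inten:.0f}" for mz, inten in peaks)

# Base peak taken from the spectrum header (1479.9, 15779); other intensities invented
example = [(1479.8824, 15779), (841.5205, 300), (1195.6243, 4000)]
text = format_peak_list(make_peak_list(example))
```

Dropping the intensity column from the formatted text gives the plain mass list variant.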
Mascot scoring
A frequency factor matrix, F, is created, in which each row represents an interval of
100 Da in peptide mass, and each column an interval of 10 kDa in intact protein
mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are
incremented so as to accumulate statistics on the size distribution of peptide masses
as a function of protein mass. The elements of F are then normalised by dividing the
elements of each 10 kDa column by the largest value in that column to give the
Mowse factor matrix M:

$$m_{i,j} = \frac{f_{i,j}}{\max_{i} f_{i,j}}$$

After searching the experimental mass values against a calculated peptide mass
database, the score for each entry is calculated according to:

$$\text{Score} = \frac{50{,}000}{M_{Prot} \times \prod_{n} m_{i,j}}$$

Where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated
from the Mowse factor elements for each match between the experimental data and
peptide masses calculated from the entry.
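The frequency-matrix normalisation and product score described above can be sketched on a toy database as follows. The two-entry database is invented for illustration, and the constant 50,000 is taken from the published Mowse description rather than from this slide; this is not Mascot's actual implementation.

```python
from collections import defaultdict
from math import prod

PEPTIDE_BIN = 100.0      # row width: 100 Da of peptide mass
PROTEIN_BIN = 10_000.0   # column width: 10 kDa of intact protein mass

def mowse_factors(database):
    """database: list of (protein_mass, [peptide_masses]).

    Returns {(row, col): factor}, where each frequency is divided by the
    largest frequency in its protein-mass column.
    """
    freq = defaultdict(int)
    for protein_mass, peptides in database:
        col = int(protein_mass // PROTEIN_BIN)
        for pep in peptides:
            freq[(int(pep // PEPTIDE_BIN), col)] += 1
    col_max = defaultdict(int)
    for (_, col), count in freq.items():
        col_max[col] = max(col_max[col], count)
    return {key: count / col_max[key[1]] for key, count in freq.items()}

def mowse_score(protein_mass, matched_peptides, factors):
    """Score = 50000 / (M_prot * product of factors for the matched peptides)."""
    col = int(protein_mass // PROTEIN_BIN)
    product = prod(factors[(int(p // PEPTIDE_BIN), col)] for p in matched_peptides)
    return 50_000 / (protein_mass * product)

# Toy database (invented): two proteins in the 40-50 kDa column
toy_db = [(42591.0, [1231.61, 1302.72, 1253.58]),
          (45000.0, [1250.0, 1260.0, 2100.0])]
factors = mowse_factors(toy_db)
score = mowse_score(42591.0, [1231.61, 1302.72], factors)
```

Peptide masses that are rare for a given protein size get a small factor, so matching them raises the score more than matching common masses.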
List of common contaminants
Trypsin autolysis peptides
Matrix peaks
Keratin from skin, hair
Other contaminants
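One way to handle such contaminants is to strip their masses from the peak list before searching. The sketch below assumes a short, hand-entered list of trypsin autolysis [M+H]+ values (commonly cited for porcine trypsin) and a 0.1 Da tolerance; both are assumptions, and a real list would also cover matrix and keratin peaks.

```python
# Assumed contaminant list: commonly cited porcine trypsin autolysis [M+H]+ peaks
TRYPSIN_AUTOLYSIS = [842.51, 2211.10]

def remove_contaminants(masses, contaminants=TRYPSIN_AUTOLYSIS, tol_da=0.1):
    """Drop every mass within tol_da of any listed contaminant mass."""
    return [m for m in masses
            if all(abs(m - c) > tol_da for c in contaminants)]

# Illustrative peak masses: the first and last fall within tolerance of the list
cleaned = remove_contaminants([842.4926, 1232.5907, 2211.0520])
```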
Protein Identification

Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207


Tandem mass spectrum

http://qbab.aber.ac.uk
Tandem mass spectrum
[Figure: 4700 MS/MS Precursor 1570.7 Spec #1 MC [BP = 175.1, 3106]. x-axis: Mass (m/z), 69.0 to 1658.0; y-axis: % Intensity, 0 to 100; maximum intensity 3105.9]
Tandem mass spectra (MS/MS) can be used for peptide sequencing
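Sequencing from MS/MS relies on ladders of fragment ions whose spacing equals residue masses. A minimal sketch computing singly charged b and y ions follows. It uses standard monoisotopic residue masses for only the residues in the example peptide DLFDPIIEDR (one of the creatine kinase peptides in the Mascot results); the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues used in the example only
RESIDUE = {"D": 115.02694, "E": 129.04259, "F": 147.06841, "I": 113.08406,
           "L": 113.08406, "P": 97.05276, "R": 156.10111}
WATER = 18.01056
PROTON = 1.00728

def b_ions(seq):
    """Singly charged b ions: N-terminal residue sums plus one proton."""
    total, ions = PROTON, []
    for aa in seq[:-1]:
        total += RESIDUE[aa]
        ions.append(round(total, 4))
    return ions

def y_ions(seq):
    """Singly charged y ions: C-terminal residue sums plus water plus one proton."""
    total, ions = WATER + PROTON, []
    for aa in reversed(seq[1:]):
        total += RESIDUE[aa]
        ions.append(round(total, 4))
    return ions

b = b_ions("DLFDPIIEDR")
y = y_ions("DLFDPIIEDR")
```

The y1 ion of any peptide ending in arginine comes out near m/z 175.12, which is consistent with the 175.1 base peak reported in the MS/MS spectrum header.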

Database Searching
Peptide Mass Fingerprinting
Sequence tag approach

De novo sequencing
inspect raw data http://qbab.aber.ac.uk
Mascot Search Results
Search title : SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path=\Mani\102004\New Analysis 1
Database : NCBInr 20040606 (1846720 sequences; 611532004 residues)
Timestamp : 20 Oct 2004 at 14:52:50 GMT
Top Score : 681 for gi|180570, creatine kinase [Homo sapiens]

Probability Based Mowse Score

Score is -10*Log(P), where P is the probability that the observed match is a random
event. Protein scores greater than 75 are significant (p<0.05).
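The score definition can be inverted to recover P, as sketched below. The `significance_threshold` helper rests on an assumption not stated on the slide: that the cutoff comes from dividing the 0.05 significance level by the number of database entries. With the 1,846,720 sequences reported for this search, that assumption gives a value between 75 and 76, close to the quoted cutoff.

```python
import math

def score_from_p(p):
    """Score = -10 * log10(P)."""
    return -10.0 * math.log10(p)

def p_from_score(score):
    """Inverse relation: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)

def significance_threshold(n_entries, alpha=0.05):
    """Assumed derivation: score at which P equals alpha / n_entries."""
    return score_from_p(alpha / n_entries)

threshold = significance_threshold(1_846_720)   # database size from the search header
top_hit_p = p_from_score(681)                    # top score from the search header
```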
Top hits from the Mascot search: there are multiple accession numbers for the same protein
Accession Mass Score Description
1. gi|180570 42591 681 creatine kinase [Homo sapiens]
2. gi|21536286 42617 681 brain creatine kinase; creatine kinase-B [Homo sapiens]
3. gi|33304149 42730 681 creatine kinase, brain [synthetic construct]
4. gi|125292 42674 568 CREATINE KINASE, B CHAIN (B-CK) [Canis familiaris]
5. gi|180572 42658 538 creatine kinase-B
6. gi|125295 42636 514 CREATINE KINASE, B CHAIN (B-CK)
7. gi|180555 42460 507 creatine kinase-B
8. gi|203476 40598 473 creatine kinase-B
9. gi|31542401 42685 471 creatine kinase, brain [Rattus norvegicus]
10. gi|203474 42699 471 creatine kinase
11. gi|40807002 44540 469 Unknown (protein for IMAGE:5598839) [Rattus norvegicus]
12. gi|47477783 44782 469 Ckb protein [Rattus norvegicus]
13. gi|13096153 42551 441 Chain A, Crystal Structure Of Bovine Retinal Creatine Kinase
14. gi|12852054 42700 427 unnamed protein product [Mus musculus]
15. gi|10946574 42686 427 creatine kinase, brain [Mus musculus]
16. gi|47213348 42953 237 unnamed protein product [Tetraodon nigroviridis]
17. gi|627264 40353 236 creatine kinase (EC 2.7.3.2) isozyme IV - African clawed frog
18. gi|27503418 42214 235 Ckb-prov protein [Xenopus laevis]
19. gi|45384340 42844 209 B-creatine kinase [Gallus gallus]
20. gi|6573489 42713 201 Chain A, Crystal Structure Of Chicken Brain-Type Creatine Kinase
Search returns a cluster of proteins with the same matching peptides
1. gi|180570 Mass: 42591 Score: 681 creatine kinase [Homo sapiens]
Observed Mr(expt) Mr(calc) Delta Start End Miss Ions Peptide
1232.62 1231.61 1231.61 0.00 87 - 96 0 45 DLFDPIIEDR
1232.62 1231.61 1231.61 0.00 87 - 96 0 ---- DLFDPIIEDR
1254.57 1253.56 1253.58 -0.02 97 - 107 0 ---- HGGYKPSDEHK
1303.70 1302.70 1302.72 -0.02 33 - 43 0 ---- VLTPELYAELR
1303.70 1302.70 1302.72 -0.02 33 - 43 0 54 VLTPELYAELR
1458.70 1457.69 1457.67 0.02 139 - 151 1 ---- GFCLPPHCSRGER
1586.81 1585.80 1585.83 -0.03 157 - 172 0 81 LAVEALSSLDGDLAGR
1586.81 1585.80 1585.83 -0.03 157 - 172 0 ---- LAVEALSSLDGDLAGR
1656.79 1655.79 1655.82 -0.03 367 - 381 0 ---- LEQGQAIDDLMPAQK
1657.80 1656.79 1656.83 -0.04 224 - 236 0 47 TFLVWVNEEDHLR
1657.80 1656.79 1656.83 -0.04 224 - 236 0 ---- TFLVWVNEEDHLR
1848.94 1847.93 1847.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1864.93 1863.92 1863.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1964.88 1963.88 1963.92 -0.05 321 - 341 0 ---- GTGGVDTAAVGGVFDVSNADR
1964.88 1963.88 1963.92 -0.05 321 - 341 0 139 GTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 ---- RGTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 27 RGTGGVDTAAVGGVFDVSNADR
2169.91 2168.91 2168.96 -0.05 14 - 32 0 ---- FPAEDEFPDLSAHNNHMAK
2225.06 2224.05 2224.17 -0.12 157 - 177 1 ---- LAVEALSSLDGDLAGRYYALK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 31 LRFPAEDEFPDLSAHNNHMAK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 ---- LRFPAEDEFPDLSAHNNHMAK
2518.10 2517.09 2517.16 -0.07 108 - 130 0 92 TDLNPDNLQGGDDLDPNYVLSSR
2518.10 2517.09 2517.16 -0.07 108 - 130 0 ---- TDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 ---- HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 55 HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR

4. gi|125292 Mass: 42674 Score: 568 CREATINE KINASE, B CHAIN (B-CK)


Observed Mr(expt) Mr(calc) Delta Start End Miss Ions Peptide
1254.57 1253.56 1253.58 -0.02 97 - 107 0 ---- HGGYKPSDEHK
1303.70 1302.70 1302.72 -0.02 33 - 43 0 ---- VLTPELYAELR
1303.70 1302.70 1302.72 -0.02 33 - 43 0 54 VLTPELYAELR
1458.70 1457.69 1457.67 0.02 139 - 151 1 ---- GFCLPPHCSRGER
1586.81 1585.80 1585.83 -0.03 157 - 172 0 81 LAVEALSSLDGDLAGR
1586.81 1585.80 1585.83 -0.03 157 - 172 0 ---- LAVEALSSLDGDLAGR
1624.76 1623.75 1623.85 -0.10 367 - 381 0 ---- LEQGQAIDDLVPAQK
1848.94 1847.93 1847.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1864.93 1863.92 1863.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1964.88 1963.88 1963.92 -0.05 321 - 341 0 ---- GTGGVDTAAVGGVFDVSNADR
1964.88 1963.88 1963.92 -0.05 321 - 341 0 139 GTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 ---- RGTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 27 RGTGGVDTAAVGGVFDVSNADR
2169.91 2168.91 2168.96 -0.05 14 - 32 0 ---- FPAEDEFPDLSAHNNHMAK
2225.06 2224.05 2224.17 -0.12 157 - 177 1 ---- LAVEALSSLDGDLAGRYYALK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 31 LRFPAEDEFPDLSAHNNHMAK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 ---- LRFPAEDEFPDLSAHNNHMAK
2518.10 2517.09 2517.16 -0.07 108 - 130 0 92 TDLNPDNLQGGDDLDPNYVLSSR
2518.10 2517.09 2517.16 -0.07 108 - 130 0 ---- TDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 ---- HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 55 HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
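The Mr(calc), Mr(expt) and Delta columns can be reproduced from a peptide sequence and its observed singly protonated mass. The sketch below uses standard monoisotopic residue masses and assumes unmodified residues (no oxidation or cysteine derivatisation); function names are hypothetical.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def mr_calc(sequence):
    """Neutral monoisotopic peptide mass: residue sum plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

def mr_expt(observed_mh):
    """Neutral mass from an observed singly protonated [M+H]+ value."""
    return observed_mh - PROTON

def delta(observed_mh, sequence):
    """Experimental minus calculated neutral mass, in Da."""
    return mr_expt(observed_mh) - mr_calc(sequence)

calc = mr_calc("DLFDPIIEDR")           # table lists Mr(calc) 1231.61
d = delta(1303.70, "VLTPELYAELR")      # table lists Delta -0.02
```

Rows in the table with a methionine 16 Da heavier than expected (for example 1864.93 versus 1848.94) would need a modification mass added, which this sketch does not model.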
Creatine kinase B is the highest scoring protein
Match to: gi|21536286 ; Score: 681
Creatine kinase - B [Homo sapiens]
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed Mass & pI: 43kd, 6.2-6.27
Sequence Coverage: 46%
1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
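The sequence coverage figure can be checked from the Start and End columns of the matched peptides for the top hit. A small sketch follows, with the ranges copied from that peptide table and the protein length (381 residues) read from the sequence listing; the function name is hypothetical.

```python
def sequence_coverage(ranges, protein_length):
    """Return (residues covered, percent coverage) for 1-based inclusive ranges."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return len(covered), 100.0 * len(covered) / protein_length

# Start-End pairs of the peptides matched to gi|180570 (duplicates removed)
MATCHED = [(87, 96), (97, 107), (33, 43), (139, 151), (157, 172),
           (367, 381), (224, 236), (342, 358), (321, 341), (320, 341),
           (14, 32), (157, 177), (12, 32), (108, 130), (97, 130)]

n_covered, percent = sequence_coverage(MATCHED, 381)
```

Overlapping peptides are counted once, so the union of ranges covers 177 of 381 residues, matching the reported 46%.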
outline

Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
Quantitative Proteomics

Sample preparation
From 2D gels ... to MALDI or ESI MS
Control Test

Pool

Cy3 Cy5

Image analysis with Delta2D, Decodon


Quantitate
Export spot list to robotic picker
... it's high-throughput
1st Dimension - Isoelectric focussing

2nd Dimension SDS PAGE

Spot picking
Trypsin gel digest
Colorectal cancer markers
[Figure: workflow. Isolate nuclear matrix; 1D and 2D gels; in-gel tryptic digest; mass spectral analysis (MS and MS/MS, m/z); database search. Protein identified? Yes: validation by immunoblotting and immunohistochemistry. No: de novo sequencing. Tumor specific markers: CC3, CC4, CC5, CC6a, CC6b]

Balasubramani et al., Cancer Res., 2006


Shotgun proteomics

Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207


Typical workflow to identify biomarkers that distinguish indolent versus aggressive forms of cancer

Group A (indolent) and Group B (aggressive), processed in parallel:
Fractionate (e.g. immunodeplete, subcellular)
Tryptic peptides
Label with iTRAQ reagent 115 (Group A) or iTRAQ reagent 116 (Group B)

Then, on the pooled sample:
Combine labeled digests
LC fractionate
MS and MS/MS
Protein ID and quantitate
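The quantitation step compares reporter-ion intensities between the two labels in each MS/MS spectrum. A minimal sketch is shown below, assuming reporter 115 carries Group A and reporter 116 carries Group B as in the workflow; summarising a protein by the median log2 ratio is an illustrative choice, and the intensities are invented.

```python
import math
from statistics import median

def protein_log2_ratio(reporter_pairs):
    """Median log2(I116 / I115) over a protein's peptide spectra.

    reporter_pairs: list of (I115, I116) intensities, one pair per MS/MS
    spectrum. Pairs with a zero or negative intensity are skipped.
    Raises statistics.StatisticsError if no usable pair remains.
    """
    logs = [math.log2(i116 / i115)
            for i115, i116 in reporter_pairs
            if i115 > 0 and i116 > 0]
    return median(logs)

# Invented intensities for three peptides of one protein
ratio = protein_log2_ratio([(100.0, 200.0), (50.0, 100.0), (80.0, 320.0)])
```

A log2 ratio of 1.0 here would mean the protein is about twofold more abundant in Group B than in Group A.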


Sample handling

In-solution
1D or 2D LC MALDI
Isoelectric focussing

HPLC
Protein-protein interaction studies
Immunoaffinity pull-downs
Tandem affinity purification
GPCL
Billy W Day
Paul Wood

Mirunalni Thangavelu
Tamanna Sultana
Emanuel M Schreiber
Chris Bolcato
Chris Myers

Patrick Miller
Robert Wolfe
definitions
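The amu is defined as 1/12th the mass of one neutral $^{12}_{6}\mathrm{C}$ atom.
The amu is also called the dalton.

$$1\ \text{amu} = \frac{1}{12}\left(\frac{12\ \text{g}\ ^{12}\mathrm{C}/\text{mol}\ ^{12}\mathrm{C}}{6.0221\times10^{23}\ \text{atoms}\ ^{12}\mathrm{C}/\text{mol}\ ^{12}\mathrm{C}}\right) = 1.6605\times10^{-24}\ \text{g/atom}\ ^{12}\mathrm{C}$$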


Isotopic species of M
(M + H)+: m/z = (M + 1H+)/1

(M + 2H)2+: m/z = (M + 2H+)/2

(M + 3H)3+: m/z = (M + 3H+)/3
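The pattern in these species generalises to m/z = (M + nH)/n for n protons. A short sketch, taking H as the proton mass (1.00728 Da) and using a neutral mass from the Mascot peptide table as the example; function names are hypothetical.

```python
PROTON = 1.00728  # mass of H+ in Da

def mz(neutral_mass, charge):
    """m/z of the (M + nH)n+ ion for n = charge."""
    return (neutral_mass + charge * PROTON) / charge

def neutral_mass(observed_mz, charge):
    """Recover M from an observed m/z and an assumed charge state."""
    return charge * (observed_mz - PROTON)

# Example: M = 1231.61 Da (Mr of DLFDPIIEDR in the search results)
singly = mz(1231.61, 1)   # about 1232.62, the observed MALDI value
doubly = mz(1231.61, 2)   # about 616.81
```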
