
Biochemistry 301

Principles of Protein Structure


Walter Chazin
5140 BIOSCI/MRBIII
E-mail: Walter.Chazin
http://structbio.vanderbilt.edu/chazin

Jan. 8-10, 2003
Text Books


Branden and Tooze
Introduction to Protein Structure

Voet, Voet and Pratt
Fundamentals of Biochemistry

Stryer
Biochemistry
Proteins: Polymers of Amino Acids
20 different amino acids: many combinations
Proteins are made in the RIBOSOME




Amino Acid Chemistry
Amino acid: NH2-CαH(R)-COOH
20 different types
Two amino acids: NH2-CαH(R1)-COOH + NH2-CαH(R2)-COOH
Dipeptide: NH2-CαH(R1)-CO-NH-CαH(R2)-COOH
Amino acid → Polypeptide → Protein

Amino Acid Chemistry
Amino acid: NH2-CαH(R)-COOH
The free amino and carboxylic acid groups have pKas
COOH ⇌ COO-    pKa ~ 2.2
NH3+ ⇌ NH2     pKa ~ 9.4
At physiological pH, amino acids are zwitterions
Zwitterion: +NH3-CαH(R)-COO-
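The zwitterion statement can be checked with the Henderson-Hasselbalch relation. The sketch below is only an illustration: it uses the two approximate pKa values from this slide and assumes pH 7.4 for "physiological pH" (that number is not on the slide).

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a titratable group that has lost its proton."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


PH = 7.4  # assumed value for physiological pH

# Carboxyl group (pKa ~ 2.2): the deprotonated form is COO-
carboxylate = fraction_deprotonated(PH, 2.2)

# Amino group (pKa ~ 9.4): the protonated form is NH3+
ammonium = 1.0 - fraction_deprotonated(PH, 9.4)

print(f"COO- fraction: {carboxylate:.4f}")
print(f"NH3+ fraction: {ammonium:.4f}")
```

Both fractions come out close to 1, so nearly every molecule carries one negative and one positive charge at the same time.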

Amino Acid Chemistry
[Figure: titration curve of an amino acid; note the axes]
Side chains can also contain titratable groups


pKas (columns: α-COOH, α-NH3+, side chain where titratable)

Amino Acids with Aliphatic R-Groups
Glycine        Gly - G    2.4    9.8
Alanine        Ala - A    2.4    9.9
Valine         Val - V    2.2    9.7
Leucine        Leu - L    2.3    9.7
Isoleucine     Ile - I    2.3    9.8

Amino Acids with Polar R-Groups
Non-Aromatic Amino Acids with Hydroxyl R-Groups
Serine         Ser - S    2.2    9.2    ~13
Threonine      Thr - T    2.1    9.1    ~13

Amino Acids with Sulfur-Containing R-Groups
Cysteine       Cys - C    1.9   10.8    8.3
Methionine     Met - M    2.1    9.3

Acidic Amino Acids and Amide Conjugates
Aspartic Acid  Asp - D    2.0    9.9    3.9
Asparagine     Asn - N    2.1    8.8
Glutamic Acid  Glu - E    2.1    9.5    4.1
Glutamine      Gln - Q    2.2    9.1

Basic Amino Acids
Arginine       Arg - R    1.8    9.0   12.5
Lysine         Lys - K    2.2    9.2   10.8
Histidine      His - H    1.8    9.2    6.0

Aromatic Amino Acids and Proline
Phenylalanine  Phe - F    2.2    9.2
Tyrosine       Tyr - Y    2.2    9.1   10.1
Tryptophan     Trp - W    2.4    9.4
Proline        Pro - P    2.0   10.6
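A short sketch of how the tabulated pKas translate into net charge on a free amino acid. The function name and the split into acidic and basic groups are illustrative choices, and pH 7.4 is an assumed value; the pKas are the ones listed on these slides.

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Approximate net charge from independent Henderson-Hasselbalch terms.

    acidic_pKas: groups that are neutral when protonated and -1 when deprotonated
                 (alpha-COOH, Asp/Glu side chains).
    basic_pKas:  groups that are +1 when protonated and neutral when deprotonated
                 (alpha-NH3+, Lys/Arg/His side chains).
    """
    negative = sum(-1.0 / (1.0 + 10.0 ** (pKa - pH)) for pKa in acidic_pKas)
    positive = sum(1.0 / (1.0 + 10.0 ** (pH - pKa)) for pKa in basic_pKas)
    return negative + positive


# Glycine (2.4, 9.8): the charge vanishes midway between the two pKas
gly_at_midpoint = net_charge((2.4 + 9.8) / 2, [2.4], [9.8])

# Aspartic acid (2.0, 9.9, side chain 3.9) and lysine (2.2, 9.2, side chain 10.8)
asp_at_7_4 = net_charge(7.4, [2.0, 3.9], [9.9])
lys_at_7_4 = net_charge(7.4, [2.2], [9.2, 10.8])

print(gly_at_midpoint, asp_at_7_4, lys_at_7_4)
```

With these inputs aspartic acid comes out close to -1 and lysine close to +1 at pH 7.4. These are values for the free amino acid, not for a residue inside a protein.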
Hierarchy of Protein Structure
20 different amino acids: many combinations
The order of amino acids: Protein sequence
Primary Structure
Local conformation, depends on sequence
Secondary Structure
Overall structure of the chain(s) in full 3D
Tertiary/Quaternary Structure
Beyond Primary Structure:
The Peptide Bond


Resonance structures:
-C(=O)-N(H)-  <->  -C(-O-)=N+(H)-
Partial double-bond character: peptide bond
Peptide plane is flat
ω angle ~180°
Implications of Peptide Planes
ω angle varies little; φ and ψ angles vary a lot
Many φ/ψ combinations cause atoms to collide
Each residue is sandwiched between two planes
[Figure: Cα (with H and R) between two peptide planes; φ is the rotation about the N-Cα bond, ψ the rotation about the Cα-C bond]
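The backbone angles are dihedral angles defined by four consecutive atoms: φ by C(i-1), N, Cα, C; ψ by N, Cα, C, N(i+1); ω by Cα, C, N(i+1), Cα(i+1). A minimal sketch of the calculation from Cartesian coordinates follows. The helper names and the test points are made up for illustration and are not real protein coordinates.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy geometry: outer atoms on opposite sides (trans) and on the same side (cis)
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
print(trans, cis)
```

A flat, trans peptide bond corresponds to the first case (ω near 180°).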
Polypeptide Backbone
Backbone restricted → limited conformations
Collisions with side chain groups further limit φ/ψ combinations
[Figure: three consecutive residues (Cα, H, R) along the backbone, with φ and ψ marked]

Secondary Structure
Local Conformation of Consecutive Residues
Three low energy backbone φ/ψ combinations
1. Right-handed helix: α-helix (φ ≈ -60°, ψ ≈ -40°)
2. Extended: antiparallel β-sheet (φ ≈ -140°, ψ ≈ 140°)
3. Left-handed helix (rare): left-handed α-helix (φ ≈ 45°, ψ ≈ 45°)
Glycine is special: it has no side chain!
Hydrogen bonds between backbone atoms
provide stability to secondary structures
Amino acids have specific preferences
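As a rough illustration of the three low energy regions, the sketch below assigns a (φ, ψ) pair to whichever of the three reference points is nearest. The reference values are the three angle pairs from the slide; which value of each pair is φ and which is ψ is an assumption made here. Nearest-point assignment is a crude stand-in for real Ramachandran regions, not an established classifier.

```python
import math

# Reference (phi, psi) points in degrees, taken from the three slide entries
REFERENCE = {
    "right-handed alpha-helix": (-60.0, -40.0),
    "extended beta-strand": (-140.0, 140.0),
    "left-handed helix": (45.0, 45.0),
}


def angular_diff(a, b):
    """Smallest absolute difference between two angles, allowing for wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_region(phi, psi):
    """Name of the reference conformation closest to (phi, psi)."""
    def distance(name):
        ref_phi, ref_psi = REFERENCE[name]
        return math.hypot(angular_diff(phi, ref_phi), angular_diff(psi, ref_psi))
    return min(REFERENCE, key=distance)


print(nearest_region(-57, -47))
print(nearest_region(-120, 130))
print(nearest_region(60, 40))
```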
Secondary Structure- Helix
H-bond
Secondary Structure- Sheet
[Figure legend: oxygen, nitrogen, R group, hydrogen, Cα, carbonyl C, H-bond]
Secondary Structure- Turn
[Figure: turn with positions labeled 1-4]
Reverses direction of the chain
Ribbon and Topology Diagrams
Representations of Secondary Structures
Sheets (arrows), Helices (cylinders)
B/T- Figure 2.17
Ribbon and Topology Diagrams
Organization of Secondary Structures
helix
B/T- Figure 2.11
Beyond Secondary Structure
Supersecondary structure (motifs): small,
discrete, commonly observed aggregates of
secondary structures
β sheet
helix-loop-helix
βαβ
Domains: independent units of structure
β barrel
four-helix bundle
*Domains and motifs sometimes interchanged*
Protein Motifs
V/V/P- Figure 6.28
Hairpin Motif
B/T- Figure 2.14
Helix-Loop-Helix (H-L-H) Motif
B/T- Figure 2.12
EF-Hand H-L-H Motif
B/T- Figure 2.13
Greek Key Motif
B/T- Figure 2.15
Multi-Domain (Modular) Proteins
[Figure: proteins assembled from domains; domain types shown: EGF, protease, kringle, Ca-binding]
Tertiary Structure
Definition: Overall 3D form of a molecule
Organization of the secondary structures/
motifs/domains
Optimization of interactions between residues
A specific 3D structure is formed
All proteins have multiple secondary
structures, almost always multiple motifs, and
in some cases multiple domains
Tertiary Structure
Specific structures result from long-range interactions
Electrostatic (charged) interactions
Hydrogen bonds (O-H, N-H, S-H)
Hydrophobic interactions
Soluble proteins have an inside (core) and outside
Folding driven by water- hydrophilic/hydrophobic
Side chain properties specify core/exterior
Some interactions inside, others outside
Tertiary Structure
I. Ionic Interactions (exterior)
Form between 2 charged side chains:
1 negative (Glu, Asp) and 1 positive (Lys, Arg, His)
Also called salt bridges.
Ionic interactions are pH-dependent (pKa).
Occur at the exterior
NOTE: pKas of groups in the interior of a protein may be
very different from those of the free amino acid.
Tertiary Structure
II. Hydrogen bonds (interior and exterior)
Form between side chains/backbone/water:
Charged side chains: Glu, Asp, His, Lys, Arg
Polar side chains: Ser, Thr, Cys, Asn, Gln, [Tyr, Trp]
Not a covalent bond, so lower in energy.
Occur inside, at the exterior, and with water.
Tertiary Structure
III. Hydrophobic Interactions (interior)
Form between side chains of non-polar residues:
Aliphatic (Ala,Val,Leu,Ile,Pro,Met)
Aromatic (Phe,Trp,[Tyr])
Clusters of side chains- but no requirement for
a specific orientation like an H-bond
In the protein interior, away from water
Not pH dependent
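One way to see the core/exterior idea is to reduce a sequence to the side chain classes listed on these slides. The sketch below does that. The one-letter class codes are invented for illustration, Tyr (bracketed on the slides) is assumed here to count as polar and Trp as non-polar, and Gly is left unclassified because it appears in neither list.

```python
# Classes taken from the interaction slides (one-letter amino acid codes)
NONPOLAR = set("AVLIPMFW")  # Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
CHARGED = set("EDHKR")      # Glu, Asp, His, Lys, Arg
POLAR = set("STCNQY")       # Ser, Thr, Cys, Asn, Gln, Tyr (Tyr placement is an assumption)


def side_chain_pattern(sequence):
    """Map each residue to h (non-polar), c (charged), p (polar) or - (unclassified)."""
    codes = []
    for residue in sequence.upper():
        if residue in NONPOLAR:
            codes.append("h")
        elif residue in CHARGED:
            codes.append("c")
        elif residue in POLAR:
            codes.append("p")
        else:
            codes.append("-")
    return "".join(codes)


pattern = side_chain_pattern("ATVRLLEWEDL")
print(pattern)
```

Residues marked h tend to cluster in the interior and residues marked c at the exterior. This labeling says nothing about the actual 3D structure.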
Tertiary Structure
IV. Disulfide Bonds (interior and exterior)
Form between Cys residues:
Cys-SH + HS-Cys → Cys-S-S-Cys
Catalyzed by specific enzymes, oxidizing agents
Restricts flexibility of the protein
Usually within a protein, less for linking proteins
Disulfide Bonding
V/V/P- Figure 16.6
Quaternary Structure
Definition: Organization of multiple chain associations
Oligomerization- Homo (self), Hetero (different)
Used in organizing single proteins and protein
machines
Specific structures result from long-range interactions
Electrostatic (charged) interactions
Hydrogen bonds (O-H, N-H, S-H)
Hydrophobic interactions
Disulfides only VERY infrequently
Quaternary Structure
The classic example- hemoglobin α2β2
B/T- Figure 3.7
END OF PART 1
Protein Structure from Sequence
The pattern of amino acid side chains determines
the local conformation and the global structure
*Pattern is more important than exact sequence*
Reporting/Comparing Protein Sequences
               5         10
h-CaM  A T V R L L E W E D L
b-CaM  A T V R L L E Y K D L
Position 8 (W/Y): conservative; position 9 (E/K): non-conservative
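The comparison of the two 11-residue sequences on this slide can be sketched in code. Treating a substitution as conservative when both residues fall in the same side chain group is an assumption made for illustration; the groups follow the amino acid slides earlier in the lecture, and real alignment tools use substitution matrices instead.

```python
# Side chain groups following the amino acid slides (one-letter codes)
GROUPS = {
    "aliphatic": "GAVLI",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic": "DE",
    "amide": "NQ",
    "basic": "RKH",
    "aromatic": "FYW",
    "proline": "P",
}
SIDE_CHAIN_CLASS = {res: name for name, members in GROUPS.items() for res in members}


def compare_sequences(seq_a, seq_b):
    """List (position, residue_a, residue_b, kind) for every mismatch; positions start at 1."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    differences = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            same_class = SIDE_CHAIN_CLASS[a] == SIDE_CHAIN_CLASS[b]
            kind = "conservative" if same_class else "non-conservative"
            differences.append((position, a, b, kind))
    return differences


differences = compare_sequences("ATVRLLEWEDL", "ATVRLLEYKDL")
for entry in differences:
    print(entry)
```

The aromatic-for-aromatic change keeps the side chain pattern. The acidic-for-basic change reverses a charge.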
Proteins Fold To Their
Native Structure
Folded proteins are only marginally stable!!
~0.4 kJ/mol per residue required to unfold (cf. ~20 kJ/mol per H-bond)
Balance loss of entropy vs. stabilizing forces
Protein fold is specified by sequence
Reversible reaction- denature (unfold)/renature (fold)
Even single mutations can cause changes
Recent discovery that amyloid diseases (e.g. CJD,
Alzheimer's) are due to unstable protein folding
How Does a Protein Find Its Fold?
A protein of n residues: 20^n possible sequences!
A 100-residue protein has 20^100 possibilities: 1.3 × 10^130!
The latest estimates indicate < 40,000
sequences in the human genome
THERE MUST BE RULES!
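The size of sequence space is easy to verify with exact integer arithmetic. The count is 20^n, so for n = 100 it is 20^100 (the reverse, 100^20, would be only 10^40). The function name below is illustrative.

```python
def sequence_space(n_residues, alphabet_size=20):
    """Number of distinct sequences of a given length over the amino acid alphabet."""
    return alphabet_size ** n_residues


n_sequences = sequence_space(100)
print(f"{n_sequences:.2e}")  # scientific notation
print(len(str(n_sequences)), "digits")
```

Against the slide's estimate of fewer than 40,000 sequences in the human genome, the sequences that actually occur are a vanishingly small fraction of this number.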
20 different amino acids: many combinations
[Figure: chain drawn from N (amino terminus) to C (carboxyl terminus), with residue numbers 1, 2, 3, 4, ...]
Limitations on Protein Sequence
Minimum length based on ability to perform a
biochemical function: ~40 residues (e.g. inhibitors)
Maximum length based on complexity of assembly:
Conversion of DNA code and production of proteins
is carried out by molecular machines that are not
perfect. If the sequence gets too long, too many
errors will build up.
*Length is generally 100-1000 residues*
Protein Folding
The hydrophobic effect is the major driving force
Hydrophobic side chains cluster/exclude water
Release of water cages in unfolded state
Other forces providing stability to the folded state
Hydrogen bonds
Electrostatic interactions
Chemical cross links- Disulfides, metal ions
Protein Folding
Random folding has too many possibilities
Backbone restricted but side chains not
A 100-residue protein would require 10^87 s to
search all conformations (age of universe < 10^18 s)
Most proteins fold in less than 10 s!!
Proteins must fold along specific pathways!!
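The slide does not state the inputs behind the 10^87 s figure. One set of round numbers that reproduces it, used here purely as an assumption, is about 10 conformations per residue sampled at 10^13 conformations per second.

```python
N_RESIDUES = 100
CONFORMATIONS_PER_RESIDUE = 10  # assumption, not stated on the slide
SAMPLING_RATE = 10 ** 13        # conformations per second; assumption
AGE_OF_UNIVERSE_S = 10 ** 18    # upper bound quoted on the slide


def exhaustive_search_seconds(n_residues, per_residue, rate):
    """Time to visit every backbone conformation once, using exact integers."""
    return per_residue ** n_residues // rate


search_time = exhaustive_search_seconds(N_RESIDUES, CONFORMATIONS_PER_RESIDUE, SAMPLING_RATE)
search_exponent = len(str(search_time)) - 1  # power of ten
universe_ages = search_time // AGE_OF_UNIVERSE_S

print(f"search time ~ 10^{search_exponent} s")
print(f"that is ~ 10^{len(str(universe_ages)) - 1} times the age of the universe")
```

Different but still plausible inputs change the exponent, not the conclusion: a random search cannot account for folding in seconds.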
Protein Folding Pathways
Usual order of folding events
Secondary structures formed quickly (local)
Secondary structures aggregate to form motifs
Hydrophobic collapse to form domains
Coalescence of domains
Molecular chaperones assist folding in vivo
Complexity of large chains/multi-domains
Cellular environment is rich in interacting
molecules: chaperones sequester proteins and
allow time to fold
Progressive Folding of Proteins
From Disordered to Native State
Protein Folding Funnel
V/V/P- Figures 6.37/38
Functional Classes of Proteins
Receptors- sense stimuli, e.g. in neurons
Channels- control cell contents
Transport- e.g. hemoglobin in blood
Storage- e.g. ferritin in liver
Enzymes- catalyze biochemical reactions
Cell function- multi-protein machines
Structural- collagen in skin
Immune response- antibodies
Structural Classes of Proteins
1. Globular proteins (enzymes, molecular machines)
Variety of secondary structures
Approximately spherical shape
Water soluble
Function in dynamic roles (e.g. catalysis,
regulation, transport, gene processing)
Globular Proteins
V/V/P- Figure 6.27
Hemoglobin α, Concanavalin A, Triose phosphate isomerase
Structural Classes of Proteins
2. Fibrous Proteins (fibrils, structural proteins)
One dominating secondary structure
Typically narrow, rod-like shape
Poor water solubility
Function in structural roles (e.g. cytoskeleton,
bone, skin)
Collagen: A Fibrous Protein
V/V/P- Figures 6.17/18
Triple Helix
Gly-Pro-Pro Repeat
Stabilizing inter-strand H-bonds
Structural Classes of Proteins
3. Membrane Proteins (receptors, channels)
Inserted into (through) membranes
Multi-domain- membrane spanning,
cytoplasmic, and extra-cellular domains
Poor water solubility
Function in cell communication (e.g. cell
signaling, transport)
Photosynthetic Reaction Center
B/T Figure 13.6
[Figure labels: extracellular; membrane-spanning; intracellular (cytoplasmic)]


In the physical sense, the
progression of living organisms
results from the communication
between molecules.

Interaction between molecules is
determined by binding affinities.
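Binding affinity is commonly expressed as a dissociation constant Kd. For simple 1:1 binding with the ligand in excess, the fraction of protein occupied is [L] / (Kd + [L]). The sketch below uses that standard relation. The Kd and the concentrations are made-up numbers for illustration only.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of protein with ligand bound for simple 1:1 binding (same units for both)."""
    return ligand_conc / (kd + ligand_conc)


KD = 1e-6  # hypothetical dissociation constant, 1 micromolar

for ligand in (1e-7, 1e-6, 1e-5):
    print(f"[L] = {ligand:.0e} M -> fraction bound = {fraction_bound(ligand, KD):.2f}")
```

Half of the protein is occupied when the ligand concentration equals Kd, so a lower Kd means tighter binding.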
Binding Classification of Proteins
Structural- other structural proteins
Receptors- regulatory proteins, transmitters
Toxins- receptors
Transport- O2/CO2, cholesterol, metals, sugars
Storage- metals, amino acids
Enzymes- substrates, inhibitors, co-factors
Cell function- proteins, RNA, DNA, metals, ions
Immune response- foreign matter (antigens)
Surface Determines What Binds
1. Steric access
2. Shape
3. Hydrophobic accessible surface
4. Electrostatic surface
Sequence and structure optimized to generate
surface properties for requisite binding event(s)
Determinants of Protein Surface
Function requires specific amino acid properties
Not all amino acids are equally useful
Abundant: Leu, Ala, Gly, Ser, Val, Glu
Rare: Trp, Cys, Met, His
Post-translational modifications
Addition of co-factors- metals, hemes, etc.
Chemical modification- phosphorylation,
glycosylation, acetylation, ubiquitination,
sumoylation
Binding Alters Protein Structure
Mechanisms of Achieving Functional Properties
1. Allosteric Control- binding at one site effects changes
in conformation or chemistry at a point distant in space
2. Stimulation/inhibition by control factors- proteins, ions,
metals control progression of a biochemical process
(e.g. controlling access to active site)
3. Reversible covalent modification- chemical bonding,
e.g. phosphorylation (kinase/phosphatase)
4. Proteolytic activation/inactivation- irreversible, involves
cleavage of one or more peptide bonds
Calcium Signal Transduction
Allostery & Stimulation by Control Factor
[Figure labels: Ca2+, calmodulin, target]
Sequence → Structure → Function
Many sequences can give same structure
Side chain pattern more important than
sequence
When homology is high (>50%), likely to have same
structure and function (Structural Genomics)
Cores conserved
Surfaces and loops more variable
*3-D shape more conserved than sequence*
*There are a limited number of structural frameworks*
Varied Relationships Between
Sequence, Structure and Function
I. Homologous: similar sequence (cytochrome c)
Same structure
Same function
Modeling structure from homology
V/V/P Figure 6.31
C-Type Cytochromes
Same structure/function- Different Sequence
Heme
Constant structural elements and basic architecture
Varied Relationships Between
Sequence, Structure and Function
II. Similar function- different sequence (dehydrogenases)
One domain same structure
One domain different
B/T Figure 10.8
NAD-Binding Domains
Conserved Domains/Functional Elements
Lactate Dehydrogenase Alcohol Dehydrogenase
Varied Relationships Between
Sequence, Structure and Function
III. Similar structure- different function (cf. thioredoxin)
Same 3-D structure
Not same function
B/T Figures 10.8/2.7
NADH-Binding and Redox
Same structure- Different Function
Alcohol Dehydrogenase Lactate Dehydrogenase
Thioredoxin
