Walter Chazin 5140 BIOSCI/MRBIII E-mail: Walter.Chazin http://structbio.vanderbilt.edu/chazin
Jan. 8-10, 2003

Text Books
  Branden and Tooze, Introduction to Protein Structure
  Voet, Voet and Pratt, Fundamentals of Biochemistry
  Stryer, Biochemistry

Proteins: Polymers of Amino Acids
  20 different amino acids: many combinations
  Proteins are made in the RIBOSOME

Amino Acid Chemistry
  Amino acid (20 different types): NH2-CαH(R)-COOH
  Two amino acids, NH2-CαH(R1)-COOH and NH2-CαH(R2)-COOH, join to form a dipeptide:
    NH2-CαH(R1)-CO-NH-CαH(R2)-COOH
  Amino acid → Polypeptide → Protein

Amino Acid Chemistry
  Amino acid: NH2-CαH(R)-COOH
  The free amino and carboxylic acid groups have pKas:
    COOH ⇌ COO-     pKa ~ 2.2
    NH3+ ⇌ NH2      pKa ~ 9.4
  At physiological pH, amino acids are zwitterions: +NH3-CαH(R)-COO-
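
The zwitterion statement follows from the Henderson-Hasselbalch relation. The sketch below is a minimal illustration, assuming the approximate backbone pKas quoted on the slide (2.2 and 9.4) and a physiological pH of 7.4 (an assumed value, not given on the slide). The function names are illustrative only, and side-chain groups are ignored.

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a titratable group that has lost its proton."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def backbone_net_charge(pH, pKa_carboxyl=2.2, pKa_amino=9.4):
    """Net charge of the alpha-carboxyl and alpha-amino groups only (no side chain).

    The default pKas are the approximate values from the slide.
    """
    carboxyl_charge = -fraction_deprotonated(pH, pKa_carboxyl)   # COO- carries -1
    amino_charge = 1.0 - fraction_deprotonated(pH, pKa_amino)    # NH3+ carries +1
    return carboxyl_charge + amino_charge


if __name__ == "__main__":
    pH = 7.4  # assumed physiological pH
    print("COO- fraction:", round(fraction_deprotonated(pH, 2.2), 4))
    print("NH3+ fraction:", round(1.0 - fraction_deprotonated(pH, 9.4), 4))
    print("net charge:   ", round(backbone_net_charge(pH), 4))
```

Both groups are almost fully charged at this pH, so the net charge is close to zero: the molecule carries one positive and one negative charge at the same time.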

Amino Acid Chemistry
  [Figure: titration curve. Note the axes.]
  Also titratable groups in side chain

Amino acid pKas

  Amino acid       Code       α-COOH   α-NH3+   Side chain

Amino Acids with Aliphatic R-Groups
  Glycine          Gly - G    2.4      9.8
  Alanine          Ala - A    2.4      9.9
  Valine           Val - V    2.2      9.7
  Leucine          Leu - L    2.3      9.7
  Isoleucine       Ile - I    2.3      9.8

Amino Acids with Polar R-Groups
Non-Aromatic Amino Acids with Hydroxyl R-Groups
  Serine           Ser - S    2.2      9.2      ~13
  Threonine        Thr - T    2.1      9.1      ~13

Amino Acids with Sulfur-Containing R-Groups
  Cysteine         Cys - C    1.9      10.8     8.3
  Methionine       Met - M    2.1      9.3

Acidic Amino Acids and Amide Conjugates
  Aspartic Acid    Asp - D    2.0      9.9      3.9
  Asparagine       Asn - N    2.1      8.8
  Glutamic Acid    Glu - E    2.1      9.5      4.1
  Glutamine        Gln - Q    2.2      9.1

Basic Amino Acids
  Arginine         Arg - R    1.8      9.0      12.5
  Lysine           Lys - K    2.2      9.2      10.8
  Histidine        His - H    1.8      9.2      6.0

Aromatic Amino Acids and Proline
  Phenylalanine    Phe - F    2.2      9.2
  Tyrosine         Tyr - Y    2.2      9.1      10.1
  Tryptophan       Trp - W    2.4      9.4
  Proline          Pro - P    2.0      10.6

Hierarchy of Protein Structure
  20 different amino acids: many combinations
  Primary Structure: the order of amino acids (protein sequence)
  Secondary Structure: local conformation, depends on sequence
  Tertiary/Quaternary Structure: overall structure of the chain(s) in full 3D

Beyond Primary Structure: The Peptide Bond
  Resonance structures: -C(=O)-N(H)-  ↔  -C(-O-)=N+(H)-
  Peptide plane is flat: ω angle ~180°
  Partial double bond: peptide bond

Implications of Peptide Planes
  ω angle varies little; φ and ψ angles vary a lot
  Many φ/ψ combinations cause atoms to collide
  Each residue is sandwiched between two planes
  [Figure: Cα(H)(R) between two peptide planes, with φ and ψ marked]

Polypeptide Backbone
  Backbone restricted: limited conformations
  Collisions with side chain groups further limit φ/ψ combinations
  [Figure: polypeptide backbone of consecutive Cα(H)(R) units, with φ and ψ marked]
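
The backbone angles discussed above are torsion (dihedral) angles, each defined by four consecutive backbone atoms: φ by C(i-1)-N-Cα-C, ψ by N-Cα-C-N(i+1), and ω by Cα-C-N(i+1)-Cα(i+1). The sketch below shows one common way to compute such an angle from four 3D coordinates. The function name and the test coordinates are illustrative assumptions, not taken from the lecture.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond for atoms p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates: a planar trans arrangement, as in a flat peptide plane
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

A planar trans arrangement of the four atoms gives a magnitude of 180°, the value quoted for ω; a planar cis arrangement gives 0°.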

Secondary Structure
  Local Conformation of Consecutive Residues
  Three low energy backbone φ/ψ combinations
    1. Right-hand helix: α-helix (φ ≈ -60°, ψ ≈ -40°)
    2. Extended: antiparallel β-sheet (φ ≈ -140°, ψ ≈ 140°)
    3. Left-hand helix (rare): left-handed α-helix (φ ≈ 45°, ψ ≈ 45°)
  Glycine is special: it has no side chain!
  Hydrogen bonds between backbone atoms provide stability to secondary structures
  Amino acids have specific preferences

Secondary Structure: Helix
  [Figure: helix with backbone H-bonds]

Secondary Structure: Sheet
  [Figure legend: Oxygen, Nitrogen, R group, Hydrogen, Cα, Carbonyl C, H-bond]

Secondary Structure: Turn
  [Figure: turn, residues 1-4]
  Reverses direction of the chain

Ribbon and Topology Diagrams
  Representations of Secondary Structures
  Sheets (arrows), Helices (cylinders)
  B/T- Figure 2.17

Ribbon and Topology Diagrams
  Organization of Secondary Structures
  [Figure label: helix]
  B/T- Figure 2.11

Beyond Secondary Structure
  Supersecondary structure (motifs): small, discrete, commonly observed aggregates of secondary structures
    β sheet, helix-loop-helix, β-α-β
  Domains: independent units of structure
    β barrel, four-helix bundle
  *Domains and motifs sometimes interchanged*

Protein Motifs: V/V/P- Figure 6.28
Hairpin Motif: B/T- Figure 2.14
Helix-Loop-Helix (H-L-H) Motif: B/T- Figure 2.12
EF-Hand H-L-H Motif: B/T- Figure 2.13
Greek Key Motif: B/T- Figure 2.15

Multi-Domain (Modular) Proteins
  Protein domains: EGF, Protease, Kringle, Ca-binding

Tertiary Structure
  Definition: Overall 3D form of a molecule
  Organization of the secondary structures/motifs/domains
  Optimization of interactions between residues
  A specific 3D structure is formed
  All proteins have multiple secondary structures, almost always multiple motifs, and in some cases multiple domains

Tertiary Structure
  Specific structures result from long-range interactions
    Electrostatic (charged) interactions
    Hydrogen bonds (O-H, N-H, S-H)
    Hydrophobic interactions
  Soluble proteins have an inside (core) and outside
    Folding driven by water: hydrophilic/hydrophobic
    Side chain properties specify core/exterior
    Some interactions inside, others outside

Tertiary Structure
  I. Ionic Interactions (exterior)
  Form between 2 charged side chains:
    1 Negative: Glu, Asp
    1 Positive: Lys, Arg, His
  Also called salt bridges.
  Ionic interactions are pH-dependent (pKa).
  Occur at the exterior
  NOTE: pKas of groups in the interior of a protein may be very different from those of the free amino acid.

Tertiary Structure
  II. Hydrogen bonds (interior and exterior)
  Form between side chains/backbone/water:
    Charged side chains: Glu, Asp, His, Lys, Arg
    Polar chains: Ser, Thr, Cys, Asn, Gln, [Tyr, Trp]
  Not a covalent bond: lower energy.
  Occur inside, at the exterior, and with water.

Tertiary Structure
  III. Hydrophobic Interactions (interior)
  Form between side chains of non-polar residues:
    Aliphatic (Ala, Val, Leu, Ile, Pro, Met)
    Aromatic (Phe, Trp, [Tyr])
  Clusters of side chains, but no requirement for a specific orientation like an H-bond
  In the protein interior, away from water
  Not pH dependent

Tertiary Structure
  IV. Disulfide Bonds (interior and exterior)
  Form between Cys residues: Cys-SH + HS-Cys → Cys-S-S-Cys
  Catalyzed by specific enzymes, oxidizing agents
  Restrict flexibility of the protein
  Usually within a protein, less for linking proteins

Disulfide Bonding: V/V/P- Figure 16.6

Quaternary Structure
  Definition: Organization of multiple chain associations
  Oligomerization: Homo (self), Hetero (different)
  Used in organizing single proteins and protein machines
  Specific structures result from long-range interactions
    Electrostatic (charged) interactions
    Hydrogen bonds (O-H, N-H, S-H)
    Hydrophobic interactions
    Disulfides only VERY infrequently

Quaternary Structure
  The classic example: hemoglobin α2β2
  B/T- Figure 3.7

END OF PART 1

Protein Structure from Sequence
  The pattern of amino acid side chains determines the local conformation and the global structure
  *Pattern is more important than exact sequence*

Reporting/Comparing Protein Sequences
                 5         10
  h-CaM  A T V R L L E W E D L
  b-CaM  A T V R L L E Y K D L
  Conservative: position 8 (W/Y). Non-conservative: position 9 (E/K).

Proteins Fold To Their Native Structure
  Folded proteins are only marginally stable!!
    ~0.4 kJ mol^-1 per residue required to unfold (cf. ~20 kJ mol^-1 per H-bond)
    Balance loss of entropy vs. stabilizing forces
  Protein fold is specified by sequence
    Reversible reaction: denature (unfold)/renature (fold)
    Even single mutations can cause changes
    Recent discovery that amyloid diseases (e.g. CJD, Alzheimer's) are due to unstable protein folding

How Does a Protein Find Its Fold?
  A protein of n residues: 20^n possible sequences!
  100 residue protein has 20^100 possibilities: 1.3 x 10^130!
  The latest estimates indicate < 40,000 sequences in the human genome
  THERE MUST BE RULES!

20 different amino acids: many combinations
  [Figure: chain from N (amino terminus) to C (carboxyl terminus), residue numbers 1, 2, 3, 4]

Limitations on Protein Sequence
  Minimum length based on ability to perform a biochemical function: ~40 residues (e.g. inhibitors)
  Maximum length based on complexity of assembly:
    Conversion of DNA code and production of proteins is carried out by molecular machines that are not perfect.
    If the sequence gets too long, too many errors will build up.
  *Length is generally 100-1000 residues*

Protein Folding
  The hydrophobic effect is the major driving force
    Hydrophobic side chains cluster/exclude water
    Release of water cages in unfolded state
  Other forces providing stability to the folded state
    Hydrogen bonds
    Electrostatic interactions
    Chemical cross links: disulfides, metal ions

Protein Folding
  Random folding has too many possibilities
    Backbone restricted but side chains not
    A 100 residue protein would require 10^87 s to search all conformations (age of universe < 10^18 s)
    Most proteins fold in less than 10 s!!
  Proteins must fold along specific pathways!!
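
The two counting arguments above can be checked with a few lines of arithmetic. The sketch below is a rough illustration only. The search-time estimate assumes about 10 backbone conformations per residue and about 10^13 conformations sampled per second; these two numbers are assumptions chosen because they reproduce the 10^87 s figure, and they are not stated on the slide.

```python
import math

N_AMINO_ACIDS = 20


def sequence_count(n_residues):
    """Number of distinct sequences of a given length: 20^n."""
    return N_AMINO_ACIDS ** n_residues


def random_search_time_s(n_residues, conformations_per_residue=10, rate_per_s=1e13):
    """Time to visit every conformation once under the assumed numbers above."""
    return conformations_per_residue ** n_residues / rate_per_s


if __name__ == "__main__":
    print(f"sequences of length 100: {sequence_count(100):.2e}")
    t = random_search_time_s(100)
    print(f"random search time: 10^{math.log10(t):.0f} s")
    print(f"ratio to 10^18 s (upper bound on age of universe): 10^{math.log10(t) - 18:.0f}")
```

Even if the assumed sampling rate were wrong by many orders of magnitude, the search time would still dwarf the age of the universe, which is the point of the argument.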

Protein Folding Pathways
  Usual order of folding events
    Secondary structures formed quickly (local)
    Secondary structures aggregate to form motifs
    Hydrophobic collapse to form domains
    Coalescence of domains
  Molecular chaperones assist folding in vivo
    Complexity of large chains/multi-domains
    Cellular environment is rich in interacting molecules
    Chaperones sequester proteins and allow time to fold

Progressive Folding of Proteins From Disordered to Native State
  Protein Folding Funnel
  V/V/P- Figures 6.37/38

Functional Classes of Proteins
  Receptors: sense stimuli, e.g. in neurons
  Channels: control cell contents
  Transport: e.g. hemoglobin in blood
  Storage: e.g. ferritin in liver
  Enzyme: catalyze biochemical reactions
  Cell function: multi-protein machines
  Structural: collagen in skin
  Immune response: antibodies

Structural Classes of Proteins
  1. Globular proteins (enzymes, molecular machines)
    Variety of secondary structures
    Approximately spherical shape
    Water soluble
    Function in dynamic roles (e.g. catalysis, regulation, transport, gene processing)

Globular Proteins: V/V/P- Figure 6.27
  Hemoglobin α, Concanavalin A, Triose phosphate isomerase

Structural Classes of Proteins
  2. Fibrous Proteins (fibrils, structural proteins)
    One dominating secondary structure
    Typically narrow, rod-like shape
    Poor water solubility
    Function in structural roles (e.g. cytoskeleton, bone, skin)

Collagen: A Fibrous Protein: V/V/P- Figures 6.17/18
  Triple Helix
  Gly-Pro-Pro Repeat
  Stabilizing Inter-strand H-bonds

Structural Classes of Proteins
  3. Membrane Proteins (receptors, channels)
    Inserted into (through) membranes
    Multi-domain: membrane-spanning, cytoplasmic, and extracellular domains
    Poor water solubility
    Function in cell communication (e.g. cell signaling, transport)

Photosynthetic Reaction Center: B/T Figure 13.6
  [Figure labels: Extracellular, Membrane-spanning, Intracellular (cytoplasmic)]

In the physical sense, the progression of living organisms results from the communication between molecules.
Interaction between molecules is determined by binding affinities.

Binding Classification of Proteins
  Structural: other structural proteins
  Receptors: regulatory proteins, transmitters
  Toxins: receptors
  Transport: O2/CO2, cholesterol, metals, sugars
  Storage: metals, amino acids
  Enzymes: substrates, inhibitors, co-factors
  Cell function: proteins, RNA, DNA, metals, ions
  Immune response: foreign matter (antigens)

Surface Determines What Binds
  1. Steric access
  2. Shape
  3. Hydrophobic accessible surface
  4. Electrostatic surface
  Sequence and structure optimized to generate surface properties for requisite binding event(s)

Determinants of Protein Surface
  Function requires specific amino acid properties
  Not all amino acids are equally useful
    Abundant: Leu, Ala, Gly, Ser, Val, Glu
    Rare: Trp, Cys, Met, His
  Post-translational modifications
    Addition of co-factors: metals, hemes, etc.
    Chemical modification: phosphorylation, glycosylation, acetylation, ubiquitination, sumoylation

Binding Alters Protein Structure

Mechanisms of Achieving Functional Properties
  1. Allosteric control: binding at one site effects changes in conformation or chemistry at a point distant in space
  2. Stimulation/inhibition by control factors: proteins, ions, metals control progression of a biochemical process (e.g. controlling access to active site)
  3. Reversible covalent modification: chemical bonding, e.g. phosphorylation (kinase/phosphatase)
  4. Proteolytic activation/inactivation: irreversible, involves cleavage of one or more peptide bonds

Calcium Signal Transduction
  Allostery & Stimulation by Control Factor
  [Figure labels: Target, Ca2+, Calmodulin]

Sequence, Structure and Function
  Many sequences can give same structure
    Side chain pattern more important than sequence
  When homology is high (>50%), likely to have same structure and function (Structural Genomics)
    Cores conserved
    Surfaces and loops more variable
  *3-D shape more conserved than sequence*
  *There are a limited number of structural frameworks*

Varied Relationships Between Sequence, Structure and Function
  I. Homologous: very similar sequence (cytochrome c)
    Same structure
    Same function
    Modeling structure from homology
    C-Type Cytochromes (V/V/P Figure 6.31): Same structure/function, Different Sequence
      [Figure labels: Heme; Constant structural elements and basic architecture]
  II. Similar function, different sequence (dehydrogenases)
    One domain same structure
    One domain different
    NAD-Binding Domains (B/T Figure 10.8): Conserved Domains/Functional Elements
      [Figure labels: Lactate Dehydrogenase, Alcohol Dehydrogenase]
  III. Similar structure, different function (cf. thioredoxin)
    Same 3-D structure
    Not same function
    NADH-Binding and Redox (B/T Figures 10.8/2.7): Same structure, Different Function
      [Figure labels: Alcohol Dehydrogenase, Lactate Dehydrogenase, Thioredoxin]
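
The homology threshold (>50%) and the conservative/non-conservative distinction can be made concrete with the two 11-residue calmodulin fragments from the Reporting/Comparing Protein Sequences slide. The sketch below is a simplified illustration. It assumes the sequences are already aligned with no gaps, and it calls a substitution conservative when both residues fall in the same side-chain group. The grouping follows the lecture's amino acid table, except that acids and amides are split into separate classes, which is an assumption made here. Real comparisons use alignment programs and substitution matrices.

```python
# Side-chain classes, loosely following the lecture's grouping (simplifying assumption)
_GROUPS = {
    "aliphatic": "GAVLI",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic": "DE",
    "amide": "NQ",
    "basic": "RKH",
    "aromatic": "FYW",
    "proline": "P",
}
SIDE_CHAIN_CLASS = {res: name for name, members in _GROUPS.items() for res in members}


def compare_sequences(seq_a, seq_b):
    """Return (percent identity, list of substitutions) for two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    identical = 0
    substitutions = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b:
            identical += 1
        else:
            same_class = SIDE_CHAIN_CLASS[a] == SIDE_CHAIN_CLASS[b]
            kind = "conservative" if same_class else "non-conservative"
            substitutions.append((position, a, b, kind))
    return 100.0 * identical / len(seq_a), substitutions


if __name__ == "__main__":
    identity, subs = compare_sequences("ATVRLLEWEDL", "ATVRLLEYKDL")
    print(f"identity: {identity:.1f}%")
    for position, a, b, kind in subs:
        print(f"position {position}: {a} -> {b} ({kind})")
```

For these fragments, 9 of 11 positions are identical, W/Y at position 8 is classed as conservative (both aromatic), and E/K at position 9 as non-conservative (acidic versus basic).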