Glycine:
– low <Pα>, but high <PT>.
– great conformational freedom.
– 3rd residue of a Type II turn.
Proline:
– low <Pα> and <Pβ>.
– due to restricted torsion angles.
– α-helix, β-sheet breaker.
The Cooperativity of Nucleation
The cooperativity of 2° structure formation:
– i.e., the statistical unlikelihood of nucleation.
– is also included in the Chou-Fasman model:
• but, implicitly, in the rules of region assignment:
– e.g., whether a sub-sequence is helix, sheet, or coil.
Regions of 2° structure assigned by inspection:
– where any 2° structure requires a string of residues of similar
propensity.
– Example: For an α-helix:
• initiation of a helix requires a contiguous set of helix formers:
– H, h, or I… with I given ½-weight.
– clearly modeling the cooperativity of helix nucleation.
• nucleated helices propagate through residues H, h, I, and i.
• and terminate when two or more helix breakers are encountered.
– again, modeling the cooperativity of the process.
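The nucleation, propagation, and termination rules above can be sketched in code. This is a minimal sketch, not the published procedure: it assumes each residue has already been given a helix class label (H, h, I, i, b, or B), and the 6-residue nucleation window with a threshold of 4 is an assumption, since the text above does not state a window size. The function name `find_helices` is hypothetical.

```python
# Per-residue helix classes, assumed to be assigned beforehand:
# 'H' strong former, 'h' former, 'I' weak former, 'i' indifferent,
# 'b' breaker, 'B' strong breaker.
FORMER_WEIGHT = {"H": 1.0, "h": 1.0, "I": 0.5}  # I counts at half weight
BREAKERS = {"b", "B"}


def find_helices(classes, window=6, threshold=4.0):
    """Return (start, end) pairs, end exclusive, for predicted helices.

    window and threshold are assumptions, not values from the text above.
    """
    n = len(classes)
    helices = []
    floor = 0  # residues before this index already belong to a helix
    i = 0
    while i + window <= n:
        score = sum(FORMER_WEIGHT.get(c, 0.0) for c in classes[i:i + window])
        if score < threshold:
            i += 1
            continue
        start, end = i, i + window
        # Propagate right until two consecutive breakers are met.
        while end < n and not (
            classes[end] in BREAKERS
            and end + 1 < n
            and classes[end + 1] in BREAKERS
        ):
            end += 1
        # Propagate left in the same way, without entering an earlier helix.
        while start > floor and not (
            classes[start - 1] in BREAKERS
            and start - 2 >= floor
            and classes[start - 2] in BREAKERS
        ):
            start -= 1
        # Drop breakers left dangling at either end.
        while classes[end - 1] in BREAKERS:
            end -= 1
        while classes[start] in BREAKERS:
            start += 1
        helices.append((start, end))
        floor = i = end
    return helices
```

For example, `find_helices("HhIIbb")` finds no helix, because the two I residues contribute only half weight each and the window scores 3 rather than 4. A single isolated breaker does not stop propagation; two in a row do.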
Example: Chou-Fasman Method
Applied to the first 24 residues of Adenylate kinase.
– method predicts 2 structures:
• N-terminal string with α-helix forming tendency.
– mean weight: <Pα> = 1.39.
• 2nd string with both α-helix and β-sheet forming tendency.
– mean β-tendency higher: <Pβ> = 1.56.
– Experimentally, the strings correspond to an α-helix and a β-sheet, respectively.
• A β-turn (specific coil) is also observed.
– predicted by a hydropathy-based modification by Rose (1978).
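The comparison of mean propensities used in this example can be sketched as follows. This is only an illustration: the propensity values in the usage lines are placeholders chosen for the example, not the published Chou-Fasman parameters, and the function names are hypothetical. The sketch also ignores any minimum-propensity cutoff and simply reports which mean is higher.

```python
def mean_propensity(segment, table):
    """Average the per-residue propensities of a segment."""
    return sum(table[res] for res in segment) / len(segment)


def compare_tendencies(segment, p_alpha, p_beta):
    """Return (<Pα>, <Pβ>, favored state) for a segment.

    Ties are reported as 'sheet'; this is an arbitrary choice in this sketch.
    """
    a = mean_propensity(segment, p_alpha)
    b = mean_propensity(segment, p_beta)
    return a, b, ("helix" if a > b else "sheet")


# Placeholder tables (illustrative values only).
P_ALPHA = {"A": 1.4, "V": 1.0, "G": 0.6}
P_BETA = {"A": 0.8, "V": 1.7, "G": 0.8}

print(compare_tendencies("AAV", P_ALPHA, P_BETA))
print(compare_tendencies("VVG", P_ALPHA, P_BETA))
```

With these placeholder tables the first segment favors helix and the second favors sheet, mirroring the two strings in the example above.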
Example (cont.)
Applied to the remainder of Adenylate kinase:
– And also compared with a 2nd method (Nagano).
– best results provided by a joint method:
• here, it obtains ~70% accuracy.
Evaluating Accuracy
The most widely used method:
– the overall, per-residue, 3-state accuracy (Q3):
Q3 = [(P_H + P_E + P_C) / N] × 100%
• N = total number of residues.
• P_X = number of correctly predicted residues in state X.
– X = α-Helix, β-shEet, or Coil.
– Although other methods exist,
• Q3 is the most conceptually simple.
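As a worked illustration of the Q3 formula, the sketch below counts correctly predicted residues per state and divides by the total. It assumes predictions and observations are given as equal-length strings over the letters H, E, and C; the function name `q3` and the example strings are made up for the illustration.

```python
def q3(predicted, observed):
    """Per-residue 3-state accuracy, in percent."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = {"H": 0, "E": 0, "C": 0}  # P_H, P_E, P_C
    for p, o in zip(predicted, observed):
        if p == o and o in correct:
            correct[o] += 1
    return 100.0 * sum(correct.values()) / len(observed)


# 8 of 10 residues match (3 H, 2 E, 3 C).
print(q3("HHHHEECCCC", "HHHCEECCCH"))  # 80.0
```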
Pioneering method by Chou-Fasman:
– overall accuracy of only about 50% (Q3).
• as assessed by a database of 267 known structures.
– initially very popular, due to conceptual simplicity.
Improvements on Chou-Fasman
Many improvements have appeared.
– differ based on parameter definition and application.
– an in-depth consideration beyond the scope of this course.
• however, success correlated with the addition of relevant
statistical information…
[1] Information regarding residue context.
– i.e., the propensity of a residue to adopt a given state:
• determined by its n neighboring residues…
– as compared with observations in a database.
– We examine: the GOR method (Garnier, 1987).
[2] Information regarding homologous proteins.
– protein first subjected to multiple alignment.
• to identify homologous proteins.
– prediction then based on consensus propensities.
– We examine: the PHD method (Rost and Sander, 1993).
The GOR Method
Propensity of a residue to adopt state S:
– defined not only by its own identity:
• as in Chou-Fasman,
– but also by the identities of neighboring residues.
– GOR uses a 17-residue window:
• a central, predicted residue + 8 flanking residues
on each side.
• e.g., residues 4–20 are used to predict the state of the 12th
residue (F) of adenylate kinase.
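The 17-residue window can be sketched as a simple slice. This is a minimal sketch under two assumptions that the text above does not specify: residue numbers are 1-based, and positions that fall off either end of the chain are filled with a '-' placeholder. The function name `gor_window` is hypothetical, and the alphabet string stands in for a real sequence.

```python
def gor_window(sequence, center, flank=8, pad="-"):
    """Return the window of 2*flank + 1 residues around a 1-based position."""
    if not 1 <= center <= len(sequence):
        raise ValueError("center must be a valid 1-based residue number")
    padded = pad * flank + sequence + pad * flank
    start = center - 1  # 0-based index of the first window residue in padded
    return padded[start:start + 2 * flank + 1]


# Placeholder sequence (not adenylate kinase): letters A..Z as residues 1..26.
seq = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(gor_window(seq, 12))  # residues 4-20: DEFGHIJKLMNOPQRST
print(gor_window(seq, 1))   # --------ABCDEFGHI
```

The central residue always sits at index `flank` of the returned window, so padding keeps window positions aligned near the chain termini.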
The GOR Method (cont.)
Using sequences in the database, 3 scoring matrices, M_S, were
first constructed:
– One for each of the 3 basic secondary-structure states, S = {H, E, C}.
– Each is a 20×17 matrix, with elements m_xy:
• row, x = amino acid type (e.g., Ala).
• column, y = residue position within the ‘window’.
• m_xy = the probability that the residue at position y is of type x…
– given that the central residue is in state S.
– So, the sum of the m_xy values in each column is 1.
– Again, each matrix constructed in advance,
• from observed frequencies in the database
– (e.g., from all known protein structures).
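The construction of the three matrices from observed frequencies can be sketched as below. This is a simplified illustration of the description above, not the published GOR procedure: it assumes the database is supplied as (sequence, state string) pairs of equal length, counts residue types at each window position around every central residue in state S, and normalizes each column so that it sums to 1. The function name `build_matrices` and the input format are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one row each
STATES = "HEC"


def build_matrices(examples, flank=8):
    """Return {state: 20 x (2*flank + 1) matrix of column-normalized frequencies}."""
    width = 2 * flank + 1
    row = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    counts = {s: [[0] * width for _ in AMINO_ACIDS] for s in STATES}
    for seq, states in examples:
        for center, s in enumerate(states):
            if s not in counts:
                continue
            for y in range(width):
                pos = center - flank + y
                # Skip window positions beyond the chain ends or unknown residues.
                if 0 <= pos < len(seq) and seq[pos] in row:
                    counts[s][row[seq[pos]]][y] += 1
    matrices = {}
    for s in STATES:
        totals = [sum(r[y] for r in counts[s]) for y in range(width)]
        # Columns with no observations are left as all zeros.
        matrices[s] = [
            [r[y] / totals[y] if totals[y] else 0.0 for y in range(width)]
            for r in counts[s]
        ]
    return matrices


# Toy input with a 3-residue window (flank=1) to keep the example readable.
m = build_matrices([("ACDA", "HHEC")], flank=1)
print(m["H"][0])  # row for Ala in the helix matrix
```

With the default `flank=8` each matrix has the 20×17 shape described above; a real database would contribute many more counts per column than this toy input.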