Prediction
K.Akila,
Lecturer,
Department of Bioinformatics
Jamal Mohamed College,
Trichy.
Proteins
Proteins play a crucial role in virtually all biological
processes with a broad range of functions.
The activity of an enzyme or the function of a
protein is governed by its three-dimensional
structure.
H11_MOUSE: histocompatibility antigen
VE2_BPV1: bovine DNA-binding domain
20 amino acids - the building blocks
[Figure: Sequence + other inputs -> propensity function -> vector of propensities; vector of propensities + other inputs -> prediction function -> prediction]
Primary structure
Primary structure refers to the "linear" sequence of amino acids.
Types of Secondary Structures
α Helices
β Sheets
Loops
Coils
α Helix
Amino acid   Pα (helix)   Pβ (sheet)   Pt (turn)
Glu          1.51         0.37         0.74
Met          1.45         1.05         0.60
Ala          1.42         0.83         0.66
Val          1.06         1.70         0.50
Ile          1.08         1.60         0.50
Tyr          0.69         1.47         1.14
Pro          0.57         0.55         1.52
Gly          0.57         0.75         1.56
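As a rough illustration of how the table above turns a sequence into a vector of propensities, here is a minimal Python sketch. The values are copied from the table. The one-letter keys, the dictionary name PROPENSITY and both function names are assumptions made for this example, and only the eight residues listed are covered.

```python
# Propensities copied from the table above: (P_alpha, P_beta, P_turn).
# Only the eight residues shown on the slide are included.
PROPENSITY = {
    "E": (1.51, 0.37, 0.74),  # Glu
    "M": (1.45, 1.05, 0.60),  # Met
    "A": (1.42, 0.83, 0.66),  # Ala
    "V": (1.06, 1.70, 0.50),  # Val
    "I": (1.08, 1.60, 0.50),  # Ile
    "Y": (0.69, 1.47, 1.14),  # Tyr
    "P": (0.57, 0.55, 1.52),  # Pro
    "G": (0.57, 0.75, 1.56),  # Gly
}


def propensity_vector(sequence):
    """Return one (P_alpha, P_beta, P_turn) tuple per residue."""
    # Raises KeyError for residues that are not in the table.
    return [PROPENSITY[residue] for residue in sequence.upper()]


def average_propensities(sequence):
    """Return the mean (P_alpha, P_beta, P_turn) over the whole sequence."""
    vector = propensity_vector(sequence)
    if not vector:
        raise ValueError("sequence must not be empty")
    return tuple(sum(column) / len(vector) for column in zip(*vector))


if __name__ == "__main__":
    print(propensity_vector("EMA"))
    print(average_propensities("EMA"))
```

A stretch such as Glu-Met-Ala averages well above 1 for Pα and below 1 for Pβ, which is the kind of signal the helix rules look for.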
Chou-Fasman method
A prediction is made for each type of
structure for each amino acid
– Can result in ambiguity if a region has high
propensities for both helix and sheet (higher
value usually chosen, with exceptions)
Chou-Fasman method
Calculation rules are somewhat ad hoc
Example: Method for helix
– Search for nucleating region where 4 out of 6
a.a. have Pα > 1.03
– Extend until 4 consecutive a.a. have an average
Pα < 1.00
– If region is at least 6 a.a. long, has an average
Pα > 1.03, and average Pα > average Pβ,
consider region to be helix
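The helix rules above can be sketched in Python as follows. This is one possible reading, not a reference implementation: the slide does not say how the extension step slides along the chain, so the sketch assumes it grows one residue at a time and stops when the 4-residue window that includes the new residue averages below 1.00. The function name find_helices and all parameter names are hypothetical. The inputs are per-residue Pα and Pβ lists, so no propensity values beyond those supplied by the caller are needed.

```python
from statistics import mean


def find_helices(p_alpha, p_beta, nucleus=6, min_formers=4, tetra=4,
                 former_cutoff=1.03, break_cutoff=1.00, min_len=6):
    """Return half-open (start, end) index pairs for predicted helices."""
    n = len(p_alpha)
    helices = []
    i = 0
    while i + nucleus <= n:
        window = p_alpha[i:i + nucleus]
        # Nucleation: at least 4 of 6 residues with P_alpha > 1.03.
        if sum(p > former_cutoff for p in window) < min_formers:
            i += 1
            continue
        start, end = i, i + nucleus
        # Extend right while the 4-residue window ending at the
        # candidate residue still averages >= 1.00 (assumed reading).
        while end < n and mean(p_alpha[end - tetra + 1:end + 1]) >= break_cutoff:
            end += 1
        # Extend left the same way, without re-entering the last helix.
        floor = helices[-1][1] if helices else 0
        while start > floor and mean(p_alpha[start - 1:start - 1 + tetra]) >= break_cutoff:
            start -= 1
        region_a = p_alpha[start:end]
        region_b = p_beta[start:end]
        # Final check: length, average P_alpha, and P_alpha vs. P_beta.
        if (end - start >= min_len
                and mean(region_a) > former_cutoff
                and mean(region_a) > mean(region_b)):
            helices.append((start, end))
        i = end  # resume the search after this region
    return helices


if __name__ == "__main__":
    # Eight Met-like residues flanked by Gly-like residues.
    alpha = [0.57] * 3 + [1.45] * 8 + [0.57] * 5
    beta = [0.75] * 3 + [1.05] * 8 + [0.75] * 5
    print(find_helices(alpha, beta))
```

Under this reading the predicted region can include a few weak residues at each end, because extension stops only once a whole 4-residue window falls below the cutoff. The last condition is also where the helix/sheet ambiguity is resolved: a Val-rich stretch nucleates but is rejected because its average Pβ exceeds its average Pα.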
Accuracy of Chou-Fasman predictions
Sequences whose 3D structures are known
are processed so that each residue is
“assigned” to a given secondary structure
class by looking at the backbone angles
Three classes are most often used (helix=H,
sheet=E, turn=C), but four classes (helix,
sheet, turn, loop) are sometimes used
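Once every residue has an assigned class, predictions can be scored residue by residue. The sketch below tallies a confusion matrix as a percentage of each true class, which matches the row layout of the matrix that follows (its H row sums to 100). The function name confusion_percentages, the "?" symbol for unknown predictions and the example strings are assumptions made for illustration, not data from the 78-protein study.

```python
from collections import Counter


def confusion_percentages(true_labels, predicted_labels,
                          classes="HEC", unknown="?"):
    """Return {true_class: {predicted_class: percent}} with rows summing to 100."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label strings must have the same length")
    # Count every (true, predicted) pair, one per residue.
    counts = Counter(zip(true_labels, predicted_labels))
    columns = list(classes) + [unknown]
    table = {}
    for true_class in classes:
        row_total = sum(counts[(true_class, p)] for p in columns)
        table[true_class] = {
            p: (100.0 * counts[(true_class, p)] / row_total) if row_total else 0.0
            for p in columns
        }
    return table


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    true = "HHHHEECC"
    pred = "HH?EEEC?"
    for true_class, row in confusion_percentages(true, pred).items():
        print(true_class, row)
```

The diagonal entries give the per-class accuracy, and the unknown column shows how often the method makes no call for residues of that class.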
Confusion matrix for Chou-Fasman method on 78 proteins
True \ Predicted   H      E     C     Unknown
H                  47.5   3.0   4.3   45.2