w_2 x_2 + w_1 x_1 + w_0 = 0

defined over a two-dimensional space, where

\bar{w} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}, \qquad \bar{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
An example line:

x_2 + x_1 - 1 = 0

Points above the line yield x_2 + x_1 - 1 > 0, while points below the line yield x_2 + x_1 - 1 < 0.
Discriminant Functions II
If the points belonging to the two classes C1 and C2 are as shown
in the Figure, they can be easily discriminated.
In general, linear discriminant functions are of the form

w_d x_d + w_{d-1} x_{d-1} + \dots + w_1 x_1 + w_0 = 0

This represents a hyperplane in a d-dimensional space. Alternatively, it can be written as

\bar{w}^t \bar{x} + w_0 = 0
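As a quick illustration, here is a minimal Python sketch (not from the original slides) that classifies points by the sign of \bar{w}^t \bar{x} + w_0, using the example line x_1 + x_2 - 1 = 0 from above; the function name is our own.

```python
import numpy as np

# Weights encoding the example line x1 + x2 - 1 = 0.
w = np.array([1.0, 1.0])
w0 = -1.0

def classify(x):
    """Return +1 for points above the line, -1 for points below."""
    return np.sign(w @ x + w0)

print(classify(np.array([1.0, 1.0])))   # 1.0: above the line
print(classify(np.array([0.2, 0.3])))   # -1.0: below the line
```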
[Figure: the line x_1 + x_2 - 1 = 0 in the (x_1, x_2) plane, crossing the axes at (0, 1) and (1, 0).]
A nonlinear boundary can be made linear by a change of variables. For the circle x^2 + y^2 = r^2 with r^2 = 1, substitute

z_1 = x^2, \qquad z_2 = y^2

This yields

z_1 + z_2 - 1 = 0

This leads to a linear hyperplane in z-space that is isomorphic to the input x-space.
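A small sketch of this feature map, with test points of our own choosing:

```python
import numpy as np

# Map (x, y) -> (z1, z2) = (x^2, y^2): the circular boundary
# x^2 + y^2 = 1 becomes the line z1 + z2 = 1 in z-space.
def to_z(x, y):
    return np.array([x**2, y**2])

inside = to_z(0.3, 0.4)    # x^2 + y^2 = 0.25 < 1
outside = to_z(1.0, 1.0)   # x^2 + y^2 = 2 > 1
print(inside.sum() - 1)    # negative: inside the circle
print(outside.sum() - 1)   # positive: outside the circle
```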
[Figure: scatter of the two classes (x's and o's) in (z_1, z_2) space, where they become linearly separable.]
Classification steps:
- Training Process: We estimate \hat{\mu}_i and \hat{\Sigma}_i using the dataset of the i-th class: D = \{\vec{x}_1, \vec{x}_2, \dots, \vec{x}_N\}
- Development Process: We fix our hyperparameters in this process.
- Testing Process: We test our model using unseen data.
We classify the feature vector \vec{x} to the class for which P(\omega_i | \vec{x}) is highest; the remaining posterior mass is the error. Example: if we have M classes and \omega_{max} is the chosen class, then error = 1 - P(\omega_{max} | \vec{x})
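A minimal sketch of this decision rule, with made-up posterior values for illustration:

```python
import numpy as np

# Posterior probabilities P(omega_i | x) for M = 3 classes
# (made-up values for illustration; they sum to 1).
posteriors = np.array([0.1, 0.7, 0.2])

omega_max = np.argmax(posteriors)     # index of the chosen class
error = 1.0 - posteriors[omega_max]   # error = 1 - P(omega_max | x)
print(omega_max, round(error, 2))     # 1 0.3
```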
[Figure: the scaled likelihoods p(x | \omega_1) P(\omega_1) and p(x | \omega_2) P(\omega_2) plotted against x.]
p(\vec{x} | \omega_i) = \frac{1}{(2\pi)^{d/2} |\Sigma_i|^{1/2}} \exp\left( -\frac{1}{2} (\vec{x} - \vec{\mu}_i)^T \Sigma_i^{-1} (\vec{x} - \vec{\mu}_i) \right)
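This density can be evaluated directly; a short sketch, with illustrative \vec{\mu} and \Sigma of our own choosing:

```python
import numpy as np

# Multivariate Gaussian density p(x | omega_i) for given mu and Sigma.
def gaussian_density(x, mu, sigma):
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

mu = np.array([0.0, 0.0])
sigma = np.eye(2)                       # identity covariance, d = 2
print(gaussian_density(mu, mu, sigma))  # peak value 1 / (2*pi) ~ 0.159
```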
CASE-1: C_1 = C_2 = \sigma^2 I. Here the boundary g(\vec{x}) = 0 is given by

g(\vec{x}) = \frac{1}{\sigma^2} (\vec{\mu}_1 - \vec{\mu}_2)^t \vec{x} - \frac{1}{2\sigma^2} (\vec{\mu}_1 - \vec{\mu}_2)^t (\vec{\mu}_1 + \vec{\mu}_2) + \ln \frac{P(\omega_1)}{P(\omega_2)}

= \frac{1}{\sigma^2} (\vec{\mu}_1 - \vec{\mu}_2)^t \left( \vec{x} - \left[ \frac{1}{2} (\vec{\mu}_1 + \vec{\mu}_2) - \frac{\sigma^2 (\vec{\mu}_1 - \vec{\mu}_2)}{\| \vec{\mu}_1 - \vec{\mu}_2 \|^2} \ln \frac{P(\omega_1)}{P(\omega_2)} \right] \right)

= \vec{\omega}^t (\vec{x} - x_o) = 0 (i.e., the separating plane passes through x_o)
Now, if P(\omega_1) = P(\omega_2), then the boundary perpendicularly bisects the line joining \vec{\mu}_1 and \vec{\mu}_2.
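A sketch of computing \vec{\omega} and x_o for this case; the means, variance, and priors below are illustrative. With equal priors, x_o falls on the midpoint, as stated above:

```python
import numpy as np

# CASE-1 boundary: omega = mu1 - mu2, x_o shifted from the midpoint
# by the (log) prior ratio. All values are illustrative.
mu1, mu2 = np.array([2.0, 0.0]), np.array([0.0, 0.0])
sigma2 = 1.0          # sigma^2
p1, p2 = 0.5, 0.5     # equal priors

omega = mu1 - mu2
x_o = 0.5 * (mu1 + mu2) - (sigma2 * np.log(p1 / p2)
                           / np.linalg.norm(mu1 - mu2) ** 2) * (mu1 - mu2)
print(x_o)  # [1. 0.]: equal priors, so x_o is the midpoint
```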
CASE-2: C_1 = C_2 = C, with C diagonal (\sigma_{jk} = 0 for j \neq k), e.g.

C = \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix}
g_i(\vec{x}) = -\frac{1}{2} (\vec{x} - \vec{\mu}_i)^t C_i^{-1} (\vec{x} - \vec{\mu}_i) + \ln P(\omega_i)

= -\frac{1}{2} \vec{x}^t C^{-1} \vec{x} + \frac{1}{2} \vec{\mu}_i^t C^{-1} \vec{x} + \frac{1}{2} \vec{x}^t C^{-1} \vec{\mu}_i - \frac{1}{2} \vec{\mu}_i^t C^{-1} \vec{\mu}_i + \ln P(\omega_i)

Ignoring the terms that do not depend on i (they cancel out when the classes are compared):

g_i(\vec{x}) = (C^{-1} \vec{\mu}_i)^t \vec{x} - \frac{1}{2} \vec{\mu}_i^t C^{-1} \vec{\mu}_i + \ln P(\omega_i)

= \vec{\omega}_i^t \vec{x} + \omega_{i0}
Now, the discriminating boundary can be given by:

g(\vec{x}) = (C^{-1} \vec{\mu}_1 - C^{-1} \vec{\mu}_2)^t \vec{x} - \frac{1}{2} \vec{\mu}_1^t C^{-1} \vec{\mu}_1 + \frac{1}{2} \vec{\mu}_2^t C^{-1} \vec{\mu}_2 + \ln \frac{P(\omega_1)}{P(\omega_2)}

= \left( C^{-1} (\vec{\mu}_1 - \vec{\mu}_2) \right)^t \vec{x} - \frac{1}{2} (\vec{\mu}_1 - \vec{\mu}_2)^t C^{-1} (\vec{\mu}_1 + \vec{\mu}_2) + \ln \frac{P(\omega_1)}{P(\omega_2)}

On comparing with the equation of the plane g(\vec{x}) = \vec{\omega}^t (\vec{x} - x_o):

\vec{\omega} = C^{-1} (\vec{\mu}_1 - \vec{\mu}_2)

x_o = \frac{1}{2} (\vec{\mu}_1 + \vec{\mu}_2) - \frac{\ln (P(\omega_1) / P(\omega_2))}{(\vec{\mu}_1 - \vec{\mu}_2)^t C^{-1} (\vec{\mu}_1 - \vec{\mu}_2)} (\vec{\mu}_1 - \vec{\mu}_2)
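A sketch of these formulas with an illustrative diagonal C and made-up means and priors:

```python
import numpy as np

# CASE-2 boundary: omega = C^{-1}(mu1 - mu2), x_o shifted along
# mu1 - mu2 by the prior ratio. All values are illustrative.
C = np.diag([1.0, 4.0])
mu1, mu2 = np.array([2.0, 1.0]), np.array([0.0, 0.0])
p1, p2 = 0.6, 0.4

Cinv = np.linalg.inv(C)
d = mu1 - mu2
omega = Cinv @ d
x_o = 0.5 * (mu1 + mu2) - (np.log(p1 / p2) / (d @ Cinv @ d)) * d
print(omega, x_o)
```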
CASE-3: C_1 \neq C_2 (the general case). The discriminant is quadratic:

g(\vec{x}) = \vec{x}^t W \vec{x} + \vec{\omega}^t \vec{x} + \omega_o = 0, where

W = -\frac{1}{2} (C_1^{-1} - C_2^{-1})

\vec{\omega} = C_1^{-1} \vec{\mu}_1 - C_2^{-1} \vec{\mu}_2

\omega_o = -\frac{1}{2} (\vec{\mu}_1^t C_1^{-1} \vec{\mu}_1 - \vec{\mu}_2^t C_2^{-1} \vec{\mu}_2) - \frac{1}{2} \ln \frac{|C_1|}{|C_2|} + \ln \frac{P(\omega_1)}{P(\omega_2)}
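A sketch of the quadratic discriminant with illustrative parameters:

```python
import numpy as np

# CASE-3: quadratic discriminant g(x) = x^t W x + omega^t x + omega_o.
# Covariances, means, and priors are illustrative values.
C1, C2 = np.diag([1.0, 2.0]), np.diag([3.0, 1.0])
mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
p1, p2 = 0.5, 0.5

C1inv, C2inv = np.linalg.inv(C1), np.linalg.inv(C2)
W = -0.5 * (C1inv - C2inv)
omega = C1inv @ mu1 - C2inv @ mu2
omega_o = (-0.5 * (mu1 @ C1inv @ mu1 - mu2 @ C2inv @ mu2)
           - 0.5 * np.log(np.linalg.det(C1) / np.linalg.det(C2))
           + np.log(p1 / p2))

def g(x):
    return x @ W @ x + omega @ x + omega_o

print(g(np.array([0.0, 0.0])))  # sign tells which side of the boundary
```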
Covariance      2D        3D         nD               Eigenvectors parallel to axes?
C = \sigma^2 I  Circle    Sphere     Hypersphere      Yes
C = Diagonal    Ellipse   Ellipsoid  Hyperellipsoid   Yes
C = Full        Ellipse   Ellipsoid  Hyperellipsoid   No
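As a quick numerical check of the last column (assuming NumPy), the eigenvectors of a diagonal covariance are axis-aligned while those of a full covariance generally are not:

```python
import numpy as np

# Eigenvectors of a diagonal vs. a full covariance matrix.
diag_C = np.diag([4.0, 1.0])
full_C = np.array([[4.0, 1.5],
                   [1.5, 1.0]])

_, vecs_diag = np.linalg.eigh(diag_C)
_, vecs_full = np.linalg.eigh(full_C)
print(vecs_diag)  # columns are standard basis vectors (axis-aligned)
print(vecs_full)  # columns are rotated away from the axes
```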