
Principal Component Analysis (PCA)

Tushar Jaruhar | Founder, SYMPLFYD


IPL on Mobile: 3 Dimensions to 2 Dimensions

Sad or Happy?

[Figure: two views of the same image, labelled SAD and HAPPY]

The way you look at Data counts


Analysing Data
• In a 3D space, data is arranged on 3 axes: Experience, Income and Age
• For 5 variables there would be 5 axes
• Number of variables = Number of axes = Number of dimensions
• Challenge: how do we recognize patterns in an n-dimensional space?

[Figure: 3D scatter on the axes Experience, Income and Age; adding Height and Weight would require 5 axes]
Analysing Data
• The BLUE line passing through the data captures the DIRECTION of maximum variation AND the MAGNITUDE of maximum variation
• The RED line is perpendicular to the BLUE line and captures the DIRECTION and MAGNITUDE of the second-highest variation
• The GREEN line captures the DIRECTION and MAGNITUDE of the third-highest variation

[Figure: the same 3D scatter (Experience, Income, Age) with the BLUE, RED and GREEN lines drawn through the data]
Dimension Reduction
• 3D: 100% of the variation is captured along 3 dimensions (Income, Experience, Age)
• 2D: 80%-90% of the variation is captured along 2 dimensions: PC1 and PC2
• 10%-20% of the variation, along the GREEN line, is lost as the 3rd dimension has been removed

Note:
• Age, Experience and Income are no longer the axes
• They have been replaced by PC1 and PC2
• The data colour has changed as some information has been discarded

[Figure: the 3D scatter (Income, Experience, Age) projected onto the 2D plane spanned by PC1 and PC2]
Data and Information
• Suppose I have data on 1000 variables such as Income, Age, Experience Level, Education, Gender etc.

• Information: "what is conveyed or represented by a particular arrangement or sequence of things"

• Do we need so many data variables to extract the relevant information for our business problem?

• Data can be highly correlated, for example Experience Level and Age

• Knowing years of experience, an indicative age range can be obtained

• So, do we need both experience level and age data?


Applications
• Facial Recognition
• Engineering
• Google Search
• Reduction in Number of Variables
• Removing Noise (Redundancy)


Eigen Vector and Eigen Values

An important relationship. Example:

$$A v = \begin{pmatrix} 0 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2 \times \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \lambda v$$

The direction (Φ) of the vector did not change; only the vector became longer.

Av = λv: when a VECTOR (v) is multiplied by a matrix (A) and the resultant is the product of a Scalar (λ) and the Vector (v), then the vector (v) is called an Eigen Vector and the Scalar (λ) is called an Eigen Value.
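
A quick numeric check of the example above (a minimal sketch in numpy; the slides themselves use no code, so the tooling here is an assumption):

```python
import numpy as np

A = np.array([[0, 1],
              [2, 1]])          # the matrix from the example
v = np.array([1, 2])            # the candidate eigen vector

Av = A @ v                      # matrix-vector product
print(Av)                       # [2 4]
print(np.allclose(Av, 2 * v))   # True: Av = 2v, so lambda = 2
```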
Eigen Vector and Eigen Value
• The term "eigen" is a German word meaning "own" (as in one's own, characteristic)
• A vector has magnitude and direction
• When a Matrix is multiplied by its Eigen Vector, the result is the Eigen Vector multiplied by a Scalar. This Scalar is called the Eigen Value
• Each and every Eigen Value has an Eigen Vector associated with it
• The relationship Av = λv allows us to extract Eigen Values and their associated Eigen Vectors
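
In practice the eigen values and eigen vectors are extracted numerically rather than by hand; a minimal sketch using numpy.linalg.eig (an assumed tool, not part of the slides):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [2.0, 1.0]])

# eig returns the eigen values and a matrix whose COLUMNS are unit eigen vectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # 2.0 and -1.0 (order may vary)

# verify Av = lambda * v for every eigen pair
for lam, vec in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ vec, lam * vec)
```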
Role of Eigen Vector and Eigen Value in Principal Component Analysis
• The 1st Eigen Vector is defined as the direction in which the data has Maximum Variance, y1: Principal Component 1
• The 2nd Eigen Vector is defined as the direction in which the data has the 2nd Highest Variance, y2: Principal Component 2
• Both principal components are orthogonal, i.e. at 90 degrees to each other, as per the PCA model
• In this model the data has only 2 dimensions, and hence there are 2 principal components
• What would happen if the data had 100 dimensions?

[Figure: scatter plot with the 1st Principal Component (y1) and 2nd Principal Component (y2) drawn through the data]
Role of Eigen Vector and Eigen Value in Principal Component Analysis
• The 1st Eigen Value is defined as the magnitude of the Maximum Variance along Principal Component 1, denoted by λ1
• The 2nd Eigen Value is defined as the magnitude of the 2nd Highest Variance along Principal Component 2, denoted by λ2
• For each eigen value there is a principal component. In other words, for each eigen value we have an eigen vector

[Figure: the same scatter plot, with λ1 and λ2 marked along the two principal components]
Our Goal
• Start with the variance-covariance matrix of the data set
• Obtain the value of maximum variance along each component, which is the Eigen Value
• Get the direction of each principal component, which is the associated Eigen Vector
• Keep only those EIGEN VECTORS (2 or 3) that explain 80 to 90% of the variance in the data
• Use these 2-3 Eigen Vectors to transform the original data set into components (face example)
• Naturally, not all the variance in the data set has been accounted for: some eigen vectors have been discarded because their contribution to the variance in the data was not significant = reduced dimensionality (see the sketch below)
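
The whole recipe can be sketched in a few lines of numpy (a minimal sketch, assuming a data matrix whose rows are observations; the function name and structure are illustrative, not from the slides):

```python
import numpy as np

def pca(X, n_components):
    """Centre, eigendecompose the covariance matrix, keep the strongest
    eigen vectors, and project the data onto them."""
    Xc = X - X.mean(axis=0)               # centre the data
    C = np.cov(Xc, rowvar=False)          # variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]     # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    V = eigvecs[:, :n_components]         # keep only the top eigen vectors
    return Xc @ V, eigvals                # transformed data + eigen values
```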
Numerical Example: Understand the Data

x1 and x2 are two variables and we have 10 observations. Our goal is to extract the Principal Components of this data set.

Step 1: Plot and evaluate the data points

Obs    x1    x2
1      2.5   2.4
2      0.5   0.7
3      2.2   2.9
4      1.9   2.2
5      3.1   3.0
6      2.3   2.7
7      2.0   1.6
8      1.0   1.1
9      1.5   1.6
10     1.1   0.9

[Figure: plot of x1 versus x2; highly correlated data, r = 92.5%]
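Step 1 can be reproduced numerically; the quoted r = 92.5% is just the correlation of the two columns (a sketch in numpy, with the ten observations typed in from the table):

```python
import numpy as np

x1 = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
x2 = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])

r = np.corrcoef(x1, x2)[0, 1]   # Pearson correlation
print(round(r, 3))              # 0.926 -- highly correlated data
```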


Numerical Example: Center the Data

Step 2: Compute the means
x̄1 = 1.81 and x̄2 = 1.91

Step 3: Centre the data by subtracting the mean

Obs    x1* = x1 - x̄1    x2* = x2 - x̄2
1       0.69              0.49
2      -1.31             -1.21
3       0.39              0.99
4       0.09              0.29
5       1.29              1.09
6       0.49              0.79
7       0.19             -0.31
8      -0.81             -0.81
9      -0.31             -0.31
10     -0.71             -1.01

What is the mean of x1* and x2*? It is 0.
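
Steps 2 and 3 in numpy (a sketch, using the same ten observations):

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

means = X.mean(axis=0)       # [1.81, 1.91]
Xc = X - means               # centred data: x1*, x2*
print(Xc.mean(axis=0))       # ~[0, 0]: the centred data has mean 0
```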
Numerical Example: Compare the Original Data with Centered Data

[Figure: left, the plot of x1 versus x2 with the centre point of the data marked; right, the centred data, now spread around the origin (0, 0)]
Numerical Example: Compute the Variance and Covariance of the Centered Data

Step 4: Using the VAR.S and COVARIANCE.S functions (sample variance and covariance), the variance-covariance matrix can be computed:

$$\begin{pmatrix} \mathrm{Var}(x_1^*) & \mathrm{Cov}(x_1^*, x_2^*) \\ \mathrm{Cov}(x_1^*, x_2^*) & \mathrm{Var}(x_2^*) \end{pmatrix} = \begin{pmatrix} 0.616556 & 0.615444 \\ 0.615444 & 0.716556 \end{pmatrix}$$
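
The same matrix falls out of np.cov, whose sample covariance uses the n - 1 divisor and so matches VAR.S/COVARIANCE.S (a sketch):

```python
import numpy as np

Xc = np.array([[ 0.69,  0.49], [-1.31, -1.21], [ 0.39,  0.99],
               [ 0.09,  0.29], [ 1.29,  1.09], [ 0.49,  0.79],
               [ 0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31],
               [-0.71, -1.01]])

C = np.cov(Xc, rowvar=False)   # rows are observations, columns are variables
print(C)
# [[0.616556 0.615444]
#  [0.615444 0.716556]]  (to 6 decimal places)
```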


Numerical Example: Compute the Eigen Values from the Variance-Covariance Matrix

Step 5: From the variance-covariance matrix the eigen values can be obtained. Recall that the eigen values are the measures of maximum variance along a Principal Component.

Find the values of λ1 and λ2.

[Figure: the scatter plot with λ1 and λ2 marked along the 1st and 2nd Principal Components, y1 and y2]
Numerical Example: Compute the Eigen Value

Av = λv

Av = λIv    (λ is a scalar; it is converted into matrix form by multiplying with the Identity Matrix I so that operations on matrices can be performed)

(A − λI)v = 0    (v, the eigen vector, cannot be the 0 vector because it has to give the direction of the principal component; so for the equation to hold, det(A − λI) must be 0)

Det|A − λI| = 0
Numerical Example: Compute the Eigen Value

Det|A − λI| = 0

$$\det\left(\begin{pmatrix} 0.616556 & 0.615444 \\ 0.615444 & 0.716556 \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right) = 0$$

$$\det\left(\begin{pmatrix} 0.616556 & 0.615444 \\ 0.615444 & 0.716556 \end{pmatrix} - \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}\right) = 0$$
Numerical Example: Compute the Eigen Value

$$\det\begin{pmatrix} 0.616556 - \lambda & 0.615444 \\ 0.615444 & 0.716556 - \lambda \end{pmatrix} = 0$$

(0.616556 − λ)(0.716556 − λ) − 0.615444 × 0.615444 = 0

(λ² − 1.333111λ + 0.441796) − 0.378772 = 0
Numerical Example: Compute the Eigen Value

(0.616556 − λ)(0.716556 − λ) − 0.615444 × 0.615444 = 0

(λ² − 1.333111λ + 0.441796) − 0.378772 = 0

λ² − 1.333111λ + 0.063024 = 0

This is a Quadratic Equation and will have two roots. The roots can be obtained from the standard formula:

$$\lambda_1, \lambda_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = 1.284028,\ 0.049083$$

For each eigen value there is a corresponding eigen vector, which is also known as the Principal Component.
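
The two roots can be checked numerically, either from the quadratic's coefficients or directly from the covariance matrix (a numpy sketch):

```python
import numpy as np

# characteristic polynomial: lambda^2 - 1.333111*lambda + 0.063024 = 0
print(np.roots([1, -1.333111, 0.063024]))   # 1.284028 and 0.049083

# the same values straight from the variance-covariance matrix
C = np.array([[0.616556, 0.615444],
              [0.615444, 0.716556]])
print(np.linalg.eigvals(C))                 # 1.284028 and 0.049083 (order may vary)
```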
Numerical Example: Compute Eigen Vector

For λ1 = 1.284028:

(A − λ1·I)v = 0

$$\begin{pmatrix} 0.616556 - \lambda_1 & 0.615444 \\ 0.615444 & 0.716556 - \lambda_1 \end{pmatrix} \begin{pmatrix} V_{11} \\ V_{12} \end{pmatrix} = 0
\;\Rightarrow\;
\begin{pmatrix} -0.667472 & 0.615444 \\ 0.615444 & -0.567472 \end{pmatrix} \begin{pmatrix} V_{11} \\ V_{12} \end{pmatrix} = 0$$

−0.667472·V11 + 0.615444·V12 = 0
0.615444·V11 − 0.567472·V12 = 0

Both equations give V11 = 0.92205·V12. If V12 = 1, then V11 = 0.92205:

$$v_1 = \begin{pmatrix} V_{11} \\ V_{12} \end{pmatrix} = \begin{pmatrix} 0.92205 \\ 1 \end{pmatrix}$$
Numerical Example: Eigen Vector

An Eigen Vector should be a UNIT vector, that is, the magnitude of the vector should be 1.

The magnitude of this vector is √(V11² + V12²) = √1.850176 = 1.360212 > 1.

We need to SCALE each value by 1.360212:

V11 = 0.92205 / 1.360212 = 0.677874
V12 = 1 / 1.360212 = 0.735179

Check the magnitude of the scaled vector: it is approx. 1.

This eigen vector, v1 = (0.677874, 0.735179), corresponds to the first eigen value, 1.284028.
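
The scaling step is a one-line normalisation (a numpy sketch; np.linalg.norm computes the magnitude):

```python
import numpy as np

v1 = np.array([0.92205, 1.0])        # direction found for lambda1
v1_unit = v1 / np.linalg.norm(v1)    # divide by sqrt(0.92205^2 + 1^2) = 1.360212
print(v1_unit)                       # [0.677874 0.735179]
print(np.linalg.norm(v1_unit))       # 1.0 -- a unit vector, as required
```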


Numerical Example: Compute Eigen Vector

For λ2 = 0.049083:

(A − λ2·I)v = 0

$$\begin{pmatrix} 0.616556 - \lambda_2 & 0.615444 \\ 0.615444 & 0.716556 - \lambda_2 \end{pmatrix} \begin{pmatrix} V_{21} \\ V_{22} \end{pmatrix} = 0
\;\Rightarrow\;
\begin{pmatrix} 0.567472 & 0.615444 \\ 0.615444 & 0.667472 \end{pmatrix} \begin{pmatrix} V_{21} \\ V_{22} \end{pmatrix} = 0$$

0.567472·V21 + 0.615444·V22 = 0
0.615444·V21 + 0.667472·V22 = 0

Both equations give V22 = −0.922053·V21 (equivalently, if V22 = −1, then V21 = +1.084537). If V21 = 1, then V22 = −0.922053:

$$v_2 = \begin{pmatrix} V_{21} \\ V_{22} \end{pmatrix} = \begin{pmatrix} 1 \\ -0.922053 \end{pmatrix}$$
Numerical Example: Eigen Vector

An Eigen Vector should be a UNIT vector, that is, the magnitude of the vector should be 1.

The magnitude of this vector is √(V21² + V22²) = √1.850182 = 1.360214 > 1.

We need to SCALE each value by 1.360214:

V21 = 1 / 1.360214 = 0.735179
V22 = −0.922053 / 1.360214 = −0.677874

Check the magnitude of the scaled vector: it is approx. 1.

This eigen vector, v2 = (0.735179, −0.677874), corresponds to the second eigen value, 0.049083.


Numerical Example: Analysis of Eigen Vector and Eigen Value
• The total variance is 100%, which is λ1 + λ2
• The first eigen vector, or First Principal Component, captures 96.31% of the variance. This is computed from λ1 / (λ1 + λ2)
• The second eigen vector, or Second Principal Component, captures 3.69% of the variance. This is computed from λ2 / (λ1 + λ2)

$$V = \begin{pmatrix} V_1 & V_2 \end{pmatrix} = \begin{pmatrix} 0.677874 & 0.735179 \\ 0.735179 & -0.677874 \end{pmatrix}, \quad \lambda_1 = 1.284028, \; \lambda_2 = 0.049083$$

[Figure: the centred data with the two principal component directions overlaid]
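
The variance shares are just the eigen values divided by their sum (a sketch):

```python
import numpy as np

eigvals = np.array([1.284028, 0.049083])   # lambda1, lambda2
ratios = eigvals / eigvals.sum()           # share of total variance
print(ratios)                              # ~[0.9632 0.0368] -> 96.31% and 3.69%
```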
Numerical Example: Transformation of Data

What does Transformation of Data imply?
• The data point (2.5, 2.5) is located on the x1-x2 coordinates
• This data point has to be re-positioned as per PC1 and PC2; therefore, we have to find its new coordinates, Y1 and Y2
• To do this we take the data matrix and multiply it by the eigenvector matrix (V)
• This is called transformation, and all data points are now referenced to PC1 and PC2

[Figure: the point (2.5, 2.5) shown on the x1 and x2 axes, with its new coordinates Y1 and Y2 read off along PC1 and PC2]
Numerical Example: Transformation of Data

Transformed Data = Mean-Adjusted Data × Eigen Vectors:

$$V = \begin{pmatrix} V_1 & V_2 \end{pmatrix} = \begin{pmatrix} 0.677873 & 0.735179 \\ 0.735179 & -0.677873 \end{pmatrix}$$

x1*     x2*       Y1       Y2
 0.69    0.49     0.828    0.175
-1.31   -1.21    -1.778   -0.143
 0.39    0.99     0.992   -0.384
 0.09    0.29     0.274   -0.130
 1.29    1.09     1.676    0.209
 0.49    0.79     0.913   -0.175
 0.19   -0.31    -0.099    0.350
-0.81   -0.81    -1.145   -0.046
-0.31   -0.31    -0.438   -0.018
-0.71   -1.01    -1.224    0.163
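
The whole table is one matrix product, Y = X*·V (a numpy sketch; Xc is the mean-adjusted data):

```python
import numpy as np

Xc = np.array([[ 0.69,  0.49], [-1.31, -1.21], [ 0.39,  0.99],
               [ 0.09,  0.29], [ 1.29,  1.09], [ 0.49,  0.79],
               [ 0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31],
               [-0.71, -1.01]])
V = np.array([[0.677873,  0.735179],    # columns: V1, V2
              [0.735179, -0.677873]])

Y = Xc @ V                   # transformed data, referenced to PC1 and PC2
print(np.round(Y[0], 3))     # [0.828 0.175] -- the first row of the table
```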


Numerical Example: Transformed Data

[Figure: left, the plot of x1 versus x2 (original data); right, the transformed data plotted against PC1 (horizontal, about -2.0 to +2.0) and PC2 (vertical, about -0.5 to +0.4)]
Numerical Example: Analysis of Eigen Vector and Eigen Value
• The total variance is 100%, which is λ1 + λ2
• The first eigen vector, or First Principal Component, captures 96.31% of the variance: λ1 / (λ1 + λ2)
• The second eigen vector, or Second Principal Component, captures 3.69% of the variance: λ2 / (λ1 + λ2)
• Since the first principal component explains 96.31% of the variance, the second principal component can be dropped without losing much information

$$V = \begin{pmatrix} 0.677874 & 0.735179 \\ 0.735179 & -0.677874 \end{pmatrix} \;\rightarrow\; V_1 = \begin{pmatrix} 0.677874 \\ 0.735179 \end{pmatrix}$$

λ1 = 1.284028, λ2 = 0.049083 → keep only λ1 = 1.284028


Numerical Example: Dimension Reduction

Transformed Data = Mean-Adjusted Data × first Eigen Vector only:

$$V_1 = \begin{pmatrix} 0.677873 \\ 0.735179 \end{pmatrix}$$

x1*     x2*       Y1
 0.69    0.49     0.828
-1.31   -1.21    -1.778
 0.39    0.99     0.992
 0.09    0.29     0.274
 1.29    1.09     1.676
 0.49    0.79     0.913
 0.19   -0.31    -0.099
-0.81   -0.81    -1.145
-0.31   -0.31    -0.438
-0.71   -1.01    -1.224
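
Dropping PC2 means multiplying by the first eigen vector alone (a sketch of the 2D-to-1D step):

```python
import numpy as np

Xc = np.array([[ 0.69,  0.49], [-1.31, -1.21], [ 0.39,  0.99],
               [ 0.09,  0.29], [ 1.29,  1.09], [ 0.49,  0.79],
               [ 0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31],
               [-0.71, -1.01]])
v1 = np.array([0.677873, 0.735179])   # first eigen vector only

y1 = Xc @ v1                          # one score per observation
print(np.round(y1, 3))
# [ 0.828 -1.778  0.992  0.274  1.676  0.913 -0.099 -1.145 -0.438 -1.224]
```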


Numerical Example: Dimension Reduction

Reduced from 2 dimensions to 1 dimension: each observation is now a single score, Y1.

[Figure: the ten Y1 scores plotted on a single axis, from -1.778 to 1.676, separating into Group 1 (negative scores) and Group 2 (positive scores)]
PCA Model

Transformed Data = Mean-Adjusted Data × Eigen Vectors:

$$\begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \end{pmatrix} \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix} = \begin{pmatrix} x_{11}v_{11} + x_{12}v_{21} & x_{11}v_{12} + x_{12}v_{22} \\ x_{21}v_{11} + x_{22}v_{21} & x_{21}v_{12} + x_{22}v_{22} \\ x_{31}v_{11} + x_{32}v_{21} & x_{31}v_{12} + x_{32}v_{22} \end{pmatrix}$$

Suppose that X1 is Income and X2 is Age:

PC1: Y1 = V11·X1 + V21·X2, i.e. Y1 = V11·Income + V21·Age
PC2: Y2 = V12·X1 + V22·X2, i.e. Y2 = V12·Income + V22·Age
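
For reference, scikit-learn's PCA reproduces these numbers (a hedged sketch; sklearn is an assumption, not part of the slides, and its components may come out with flipped signs, which does not change the model):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

pca = PCA(n_components=2)
Y = pca.fit_transform(X)                # centres X, then projects onto the PCs
print(pca.explained_variance_)          # ~[1.284028 0.049083] = lambda1, lambda2
print(pca.explained_variance_ratio_)    # ~[0.9632 0.0368]
print(np.round(Y[0], 3))                # ~[0.828 0.175], up to sign
```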
